Alibaba's Seata Is Great: Digging into the Saga Mode Source Code

This article is reposted from the WeChat official account 「程序员jinjunzhu」, written by jinjunzhu. To repost it, please contact the 程序员jinjunzhu official account.

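Saga is one of the more widely used patterns for distributed transactions. It mainly suits long-running flows that span multiple nodes: within a global transaction, if one node throws an exception, the transactions are compensated one by one, starting from that node and working backwards. Both the phase-one forward services and the phase-two compensation services have to be implemented in business code. Today we will look at how Seata implements this in its source code.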
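To make the idea concrete before reading Seata's code, here is a minimal sketch of forward execution with backward compensation. It is not Seata's API: the names SagaCompensationSketch, Step and run are made up for illustration, and it simplifies by compensating only the steps that completed before the failure.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical names throughout: this only illustrates the saga idea, it is not Seata code.
class SagaCompensationSketch {

    /** One saga step: a forward action and the compensation that undoes it. */
    static final class Step {
        final String name;
        final Runnable action;
        final Runnable compensation;

        Step(String name, Runnable action, Runnable compensation) {
            this.name = name;
            this.action = action;
            this.compensation = compensation;
        }
    }

    /**
     * Runs the steps in order. If a step throws, the steps that already
     * completed are compensated in reverse order. Returns true on success.
     */
    static boolean run(List<Step> steps) {
        Deque<Step> completed = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.action.run();
                completed.push(step); // the most recent step ends up on top
            } catch (RuntimeException e) {
                while (!completed.isEmpty()) {
                    completed.pop().compensation.run();
                }
                return false;
            }
        }
        return true;
    }

    /** Order and account succeed, storage fails, so account and then order are compensated. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        List<Step> steps = new ArrayList<>();
        steps.add(new Step("order", () -> log.add("saveOrder"), () -> log.add("deleteOrder")));
        steps.add(new Step("account", () -> log.add("decreaseAccount"), () -> log.add("compensateAccount")));
        steps.add(new Step("storage", () -> { throw new IllegalStateException("out of stock"); },
                () -> log.add("compensateStorage")));
        boolean ok = run(steps);
        log.add(ok ? "SUCCEEDED" : "FAILED");
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The stack of completed steps is what gives the "work backwards" behaviour: the last step to succeed is the first to be undone.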

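State machine definition

Take a typical e-commerce purchase flow as an example. We define three services: the order service (OrderServer), the account service (AccountService) and the storage service (StorageService). The order service acts as the aggregating service, that is, the TM.

When an external caller places an order, the order service first creates the order, then calls the account service to deduct the balance, and finally calls the storage service to deduct stock. The flow is shown in the figure below: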


 

 

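Seata's saga mode is implemented on top of a state machine. The state machine needs a JSON file to control its states, and that JSON file is defined as follows: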

    {
        "Name": "buyGoodsOnline",
        "Comment": "buy a goods on line, add order, deduct account, deduct storage ",
        "StartState": "SaveOrder",
        "Version": "0.0.1",
        "States": {
            "SaveOrder": {
                "Type": "ServiceTask",
                "ServiceName": "orderSave",
                "ServiceMethod": "saveOrder",
                "CompensateState": "DeleteOrder",
                "Next": "ChoiceAccountState",
                "Input": [
                    "$.[businessKey]",
                    "$.[order]"
                ],
                "Output": {
                    "SaveOrderResult": "$.#root"
                },
                "Status": {
                    "#root == true": "SU",
                    "#root == false": "FA",
                    "$Exception{java.lang.Throwable}": "UN"
                }
            },
            "ChoiceAccountState": {
                "Type": "Choice",
                "Choices": [
                    {
                        "Expression": "[SaveOrderResult] == true",
                        "Next": "ReduceAccount"
                    }
                ],
                "Default": "Fail"
            },
            "ReduceAccount": {
                "Type": "ServiceTask",
                "ServiceName": "accountService",
                "ServiceMethod": "decrease",
                "CompensateState": "CompensateReduceAccount",
                "Next": "ChoiceStorageState",
                "Input": [
                    "$.[businessKey]",
                    "$.[userId]",
                    "$.[money]",
                    {
                        "throwException": "$.[mockReduceAccountFail]"
                    }
                ],
                "Output": {
                    "ReduceAccountResult": "$.#root"
                },
                "Status": {
                    "#root == true": "SU",
                    "#root == false": "FA",
                    "$Exception{java.lang.Throwable}": "UN"
                },
                "Catch": [
                    {
                        "Exceptions": [
                            "java.lang.Throwable"
                        ],
                        "Next": "CompensationTrigger"
                    }
                ]
            },
            "ChoiceStorageState": {
                "Type": "Choice",
                "Choices": [
                    {
                        "Expression": "[ReduceAccountResult] == true",
                        "Next": "ReduceStorage"
                    }
                ],
                "Default": "Fail"
            },
            "ReduceStorage": {
                "Type": "ServiceTask",
                "ServiceName": "storageService",
                "ServiceMethod": "decrease",
                "CompensateState": "CompensateReduceStorage",
                "Input": [
                    "$.[businessKey]",
                    "$.[productId]",
                    "$.[count]",
                    {
                        "throwException": "$.[mockReduceStorageFail]"
                    }
                ],
                "Output": {
                    "ReduceStorageResult": "$.#root"
                },
                "Status": {
                    "#root == true": "SU",
                    "#root == false": "FA",
                    "$Exception{java.lang.Throwable}": "UN"
                },
                "Catch": [
                    {
                        "Exceptions": [
                            "java.lang.Throwable"
                        ],
                        "Next": "CompensationTrigger"
                    }
                ],
                "Next": "Succeed"
            },
            "DeleteOrder": {
                "Type": "ServiceTask",
                "ServiceName": "orderSave",
                "ServiceMethod": "deleteOrder",
                "Input": [
                    "$.[businessKey]",
                    "$.[order]"
                ]
            },
            "CompensateReduceAccount": {
                "Type": "ServiceTask",
                "ServiceName": "accountService",
                "ServiceMethod": "compensateDecrease",
                "Input": [
                    "$.[businessKey]",
                    "$.[userId]",
                    "$.[money]"
                ]
            },
            "CompensateReduceStorage": {
                "Type": "ServiceTask",
                "ServiceName": "storageService",
                "ServiceMethod": "compensateDecrease",
                "Input": [
                    "$.[businessKey]",
                    "$.[productId]",
                    "$.[count]"
                ]
            },
            "CompensationTrigger": {
                "Type": "CompensationTrigger",
                "Next": "Fail"
            },
            "Succeed": {
                "Type": "Succeed"
            },
            "Fail": {
                "Type": "Fail",
                "ErrorCode": "PURCHASE_FAILED",
                "Message": "purchase failed"
            }
        }
    }
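Each ServiceTask above carries a Status block that turns the outcome of the service call into a status code: a true return value becomes SU (succeeded), false becomes FA (failed), and any Throwable becomes UN (unknown). The sketch below mirrors that mapping in plain Java. StatusMappingSketch and decide are hypothetical names, not Seata classes, and treating a null result as FA is an assumption made only to keep the example small.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not Seata code: mirrors the "Status" block of a ServiceTask.
class StatusMappingSketch {

    enum ExecutionStatus { SU, FA, UN }

    /** Maps the outcome of a service call to a status, following the JSON rules. */
    static ExecutionStatus decide(Callable<Boolean> serviceCall) {
        try {
            Boolean result = serviceCall.call();
            // "#root == true" -> SU, "#root == false" -> FA (null is treated as FA here)
            return Boolean.TRUE.equals(result) ? ExecutionStatus.SU : ExecutionStatus.FA;
        } catch (Throwable t) {
            // "$Exception{java.lang.Throwable}" -> UN
            return ExecutionStatus.UN;
        }
    }

    public static void main(String[] args) {
        System.out.println(decide(() -> true));
        System.out.println(decide(() -> false));
        System.out.println(decide(() -> { throw new IllegalStateException("boom"); }));
    }
}
```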

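The state machine runs in the TM, which is the order service defined above. When the order service creates an order it needs to begin a global transaction, and that is when the state machine has to be started. The code is as follows: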

    StateMachineEngine stateMachineEngine = (StateMachineEngine) ApplicationContextUtils.getApplicationContext().getBean("stateMachineEngine");

    Map<String, Object> startParams = new HashMap<>(3);
    String businessKey = String.valueOf(System.currentTimeMillis());
    startParams.put("businessKey", businessKey);
    startParams.put("order", order);
    startParams.put("mockReduceAccountFail", "true");
    startParams.put("userId", order.getUserId());
    startParams.put("money", order.getPayAmount());
    startParams.put("productId", order.getProductId());
    startParams.put("count", order.getCount());

    //sync test
    StateMachineInstance inst = stateMachineEngine.startWithBusinessKey("buyGoodsOnline", null, businessKey, startParams);

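As you can see, the buyGoodsOnline used in the code above is exactly the value of the Name property in the JSON file.

State machine initialization

So where is the stateMachineEngine bean used in the order-creation code defined? The order service demo has a class, StateMachineConfiguration, that defines it: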

    @Configuration
    public class StateMachineConfiguration {

        @Bean
        public ThreadPoolExecutorFactoryBean threadExecutor(){
            ThreadPoolExecutorFactoryBean threadExecutor = new ThreadPoolExecutorFactoryBean();
            threadExecutor.setThreadNamePrefix("SAGA_ASYNC_EXE_");
            threadExecutor.setCorePoolSize(1);
            threadExecutor.setMaxPoolSize(20);
            return threadExecutor;
        }

        @Bean
        public DbStateMachineConfig dbStateMachineConfig(ThreadPoolExecutorFactoryBean threadExecutor, DataSource hikariDataSource) throws IOException {
            DbStateMachineConfig dbStateMachineConfig = new DbStateMachineConfig();
            dbStateMachineConfig.setDataSource(hikariDataSource);
            dbStateMachineConfig.setThreadPoolExecutor((ThreadPoolExecutor) threadExecutor.getObject());
            /**
             * The path of the JSON files is configured here. When the TM initializes, each JSON file is parsed into a
             * StateMachineImpl. If the database has no record of this state machine, it is saved to the
             * seata_state_machine_def table; if the database already has records, the latest one is taken.
             * The state machine is then registered in StateMachineRepositoryImpl.
             * Two Maps are used for registration: stateMachineMapByNameAndTenant, whose key format is
             * (stateMachineName + "_" + tenantId), and stateMachineMapById, whose key is stateMachine.getId().
             * See the registryStateMachine method of StateMachineRepositoryImpl for the code.
             * Registration is triggered from init(), the initialization method of DefaultStateMachineConfig,
             * which is the parent class of DbStateMachineConfig.
             */
            dbStateMachineConfig.setResources(new PathMatchingResourcePatternResolver().getResources("classpath*:statelang/*.json"));//JSON files
            dbStateMachineConfig.setEnableAsync(true);
            dbStateMachineConfig.setApplicationId("order-server");
            dbStateMachineConfig.setTxServiceGroup("my_test_tx_group");
            return dbStateMachineConfig;
        }

        @Bean
        public ProcessCtrlStateMachineEngine stateMachineEngine(DbStateMachineConfig dbStateMachineConfig){
            ProcessCtrlStateMachineEngine stateMachineEngine = new ProcessCtrlStateMachineEngine();
            stateMachineEngine.setStateMachineConfig(dbStateMachineConfig);
            return stateMachineEngine;
        }

        @Bean
        public StateMachineEngineHolder stateMachineEngineHolder(ProcessCtrlStateMachineEngine stateMachineEngine){
            StateMachineEngineHolder stateMachineEngineHolder = new StateMachineEngineHolder();
            stateMachineEngineHolder.setStateMachineEngine(stateMachineEngine);
            return stateMachineEngineHolder;
        }
    }

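As you can see, DbStateMachineConfig is where we configure the state machine's JSON files, along with applicationId and txServiceGroup. When DbStateMachineConfig is initialized, the init method of its parent class DefaultStateMachineConfig parses the JSON files into state machines and registers them.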
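The registration step keeps each state machine reachable under two keys, stateMachineName + "_" + tenantId and the state machine id. The following is a rough sketch of that double index only. StateMachineRegistrySketch and Definition are hypothetical names, and the database lookup and insert that the real repository performs are left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not Seata's StateMachineRepositoryImpl: only the two lookup maps are modelled.
class StateMachineRegistrySketch {

    /** Minimal stand-in for a parsed state machine definition. */
    static final class Definition {
        final String id;
        final String name;
        final String tenantId;

        Definition(String id, String name, String tenantId) {
            this.id = id;
            this.name = name;
            this.tenantId = tenantId;
        }
    }

    private final Map<String, Definition> byNameAndTenant = new ConcurrentHashMap<>();
    private final Map<String, Definition> byId = new ConcurrentHashMap<>();

    /** Registers a definition under both keys. */
    public void register(Definition definition) {
        byNameAndTenant.put(definition.name + "_" + definition.tenantId, definition);
        byId.put(definition.id, definition);
    }

    public Definition findByName(String name, String tenantId) {
        return byNameAndTenant.get(name + "_" + tenantId);
    }

    public Definition findById(String id) {
        return byId.get(id);
    }

    public static void main(String[] args) {
        StateMachineRegistrySketch registry = new StateMachineRegistrySketch();
        registry.register(new Definition("def-1", "buyGoodsOnline", "000001"));
        System.out.println(registry.findByName("buyGoodsOnline", "000001").id);
    }
}
```

Looking up by name and tenant is what a call such as startWithBusinessKey needs, since the caller only knows the state machine's name.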

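During registration, one record is inserted into the seata_state_machine_def table. Its content column stores the content of our JSON file; the values of the other columns are shown in the figure below: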

 

Appendix: for the JSON file above, the StateMachineImpl we saw while debugging looks like this:

    id = null
    tenantId = null
    appName = "SEATA"
    name = "buyGoodsOnline"
    comment = "buy a goods on line, add order, deduct account, deduct storage "
    version = "0.0.1"
    startState = "SaveOrder"
    status = {StateMachine$Status@9135} "AC"
    recoverStrategy = null
    isPersist = true
    type = "STATE_LANG"
    content = null
    gmtCreate = null
    states = {LinkedHashMap@9137}  size = 11
       "SaveOrder" -> {ServiceTaskStateImpl@9153}
       "ChoiceAccountState" -> {ChoiceStateImpl@9155}
       "ReduceAccount" -> {ServiceTaskStateImpl@9157}
       "ChoiceStorageState" -> {ChoiceStateImpl@9159}
       "ReduceStorage" -> {ServiceTaskStateImpl@9161}
       "DeleteOrder" -> {ServiceTaskStateImpl@9163}
       "CompensateReduceAccount" -> {ServiceTaskStateImpl@9165}
       "CompensateReduceStorage" -> {ServiceTaskStateImpl@9167}
       "CompensationTrigger" -> {CompensationTriggerStateImpl@9169}
       "Succeed" -> {SucceedEndStateImpl@9171}
       "Fail" -> {FailEndStateImpl@9173}

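Starting the state machine

In the order-creation code from the first section, the startWithBusinessKey method starts the whole transaction. There is also an asynchronous variant, startWithBusinessKeyAsync; here we only analyze the synchronous one. The source code is as follows: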

    public StateMachineInstance startWithBusinessKey(String stateMachineName, String tenantId, String businessKey,
                                                     Map<String, Object> startParams) throws EngineExecutionException {
        return startInternal(stateMachineName, tenantId, businessKey, startParams, false, null);
    }

    private StateMachineInstance startInternal(String stateMachineName, String tenantId, String businessKey,
                                               Map<String, Object> startParams, boolean async, AsyncCallback callback)
        throws EngineExecutionException {
        //part of the source code omitted
        //create a state machine instance
        //the default value is tenantId="000001"
        StateMachineInstance instance = createMachineInstance(stateMachineName, tenantId, businessKey, startParams);

        /**
         * ProcessType.STATE_LANG: this enum has only one element
         * OPERATION_NAME_START = "start"
         * callback is null
         * getStateMachineConfig() returns the DbStateMachineConfig
         */
        ProcessContextBuilder contextBuilder = ProcessContextBuilder.create().withProcessType(ProcessType.STATE_LANG)
            .withOperationName(DomainConstants.OPERATION_NAME_START).withAsyncCallback(callback).withInstruction(
                new StateInstruction(stateMachineName, tenantId)).withStateMachineInstance(instance)
            .withStateMachineConfig(getStateMachineConfig()).withStateMachineEngine(this);

        Map<String, Object> contextVariables;
        if (startParams != null) {
            contextVariables = new ConcurrentHashMap<>(startParams.size());
            nullSafeCopy(startParams, contextVariables);
        } else {
            contextVariables = new ConcurrentHashMap<>();
        }
        instance.setContext(contextVariables);//assign the start parameters to the state machine instance's context
        //add the parameters to the variables of ProcessContextImpl
        contextBuilder.withStateMachineContextVariables(contextVariables);

        contextBuilder.withIsAsyncExecution(async);

        //the builder defined above creates a ProcessContextImpl
        ProcessContext processContext = contextBuilder.build();

        //this condition is true
        if (instance.getStateMachine().isPersist() && stateMachineConfig.getStateLogStore() != null) {
            //record the start of the state machine
            stateMachineConfig.getStateLogStore().recordStateMachineStarted(instance, processContext);
        }
        if (StringUtils.isEmpty(instance.getId())) {
            instance.setId(
                stateMachineConfig.getSeqGenerator().generate(DomainConstants.SEQ_ENTITY_STATE_MACHINE_INST));
        }

        if (async) {
            stateMachineConfig.getAsyncProcessCtrlEventPublisher().publish(processContext);
        } else {
            //publish a message to the EventBus; the consumer is ProcessCtrlEventConsumer, set up when DefaultStateMachineConfig initializes
            stateMachineConfig.getProcessCtrlEventPublisher().publish(processContext);
        }

        return instance;
    }

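From the code above we can see that starting the state machine mainly does two things: it records the start of the state machine, and it publishes a message to the EventBus. Let's look at both in detail.

Beginning the global transaction

The code analyzed above contains a line that records the start of the state machine: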

    stateMachineConfig.getStateLogStore().recordStateMachineStarted(instance, processContext);

This calls the recordStateMachineStarted method of the DbAndReportTcStateLogStore class. Let's take a look:

    public void recordStateMachineStarted(StateMachineInstance machineInstance, ProcessContext context) {

        if (machineInstance != null) {
            //if parentId is not null, machineInstance is a SubStateMachine, do not start a new global transaction
            //use parent transaction instead
            String parentId = machineInstance.getParentId();
            if (StringUtils.hasLength(parentId)) {
                if (StringUtils.isEmpty(machineInstance.getId())) {
                    machineInstance.setId(parentId);
                }
            } else {
                //this branch is taken, because no sub state machine is configured
                /**
                 * beginTransaction begins the global transaction;
                 * it calls the TC to begin it
                 */
                beginTransaction(machineInstance, context);
            }


            if (StringUtils.isEmpty(machineInstance.getId()) && seqGenerator != null) {
                machineInstance.setId(seqGenerator.generate(DomainConstants.SEQ_ENTITY_STATE_MACHINE_INST));
            }

            // save to db
            //dbType = "MySQL"
            machineInstance.setSerializedStartParams(paramsSerializer.serialize(machineInstance.getStartParams()));
            executeUpdate(stateLogStoreSqls.getRecordStateMachineStartedSql(dbType),
                    STATE_MACHINE_INSTANCE_TO_STATEMENT_FOR_INSERT, machineInstance);
        }
    }

The executeUpdate method above lives in the parent class AbstractStore. Debugging into executeUpdate shows that the SQL executed here is:

    INSERT INTO seata_state_machine_inst
    (id, machine_id, tenant_id, parent_id, gmt_started, business_key, start_params, is_running, status, gmt_updated)
    VALUES ('192.168.59.146:8091:65853497147990016', '06a098cab53241ca7ed09433342e9f07', '000001', null, '2020-10-31 17:18:24.773',
    '1604135904773', '{"@type":"java.util.HashMap","money":50.,"productId":1L,"_business_key_":"1604135904773","businessKey":"1604135904773",
    "count":1,"mockReduceAccountFail":"true","userId":1L,"order":{"@type":"io.seata.sample.entity.Order","count":1,"payAmount":50,
    "productId":1,"userId":1}}', 1, 'RU', '2020-10-31 17:18:24.773')

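As you can see, the global transaction is recorded in the seata_state_machine_inst table. What is recorded is the parameters we started the state machine with, and the status column holds "RU", that is, RUNNING.

Branch transaction handling

As mentioned earlier, after the state machine starts, a message is published to the EventBus. The consumer of this message is ProcessCtrlEventConsumer. Let's look at this class: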

    public class ProcessCtrlEventConsumer implements EventConsumer<ProcessContext> {

        private ProcessController processController;

        @Override
        public void process(ProcessContext event) throws FrameworkException {
            //the processController here is a ProcessControllerImpl
            processController.process(event);
        }

        @Override
        public boolean accept(Class<ProcessContext> clazz) {
            return ProcessContext.class.isAssignableFrom(clazz);
        }

        public void setProcessController(ProcessController processController) {
            this.processController = processController;
        }
    }

The process method of ProcessControllerImpl has two pieces of logic, process and route:

    public void process(ProcessContext context) throws FrameworkException {

        try {
            //the businessProcessor here is a CustomizeBusinessProcessor
            businessProcessor.process(context);

            businessProcessor.route(context);

        } catch (FrameworkException fex) {
            throw fex;
        } catch (Exception ex) {
            LOGGER.error("Unknown exception occurred, context = {}", context, ex);
            throw new FrameworkException(ex, "Unknown exception occurred", FrameworkErrorCode.UnknownAppError);
        }
    }

The logic here is somewhat complicated, so here is a UML class diagram first. Following the diagram makes the call flow easier to sort out:


 

 

Let's first look at the process method in CustomizeBusinessProcessor:

    public void process(ProcessContext context) throws FrameworkException {

        /**
         * processType = {ProcessType@10310} "STATE_LANG"
         * code = "STATE_LANG"
         * message = "SEATA State Language"
         * name = "STATE_LANG"
         * ordinal = 0
         */
        ProcessType processType = matchProcessType(context);
        if (processType == null) {
            if (LOGGER.isWarnEnabled()) {
                LOGGER.warn("Process type not found, context= {}", context);
            }
            throw new FrameworkException(FrameworkErrorCode.ProcessTypeNotFound);
        }

        ProcessHandler processor = processHandlers.get(processType.getCode());
        if (processor == null) {
            LOGGER.error("Cannot find process handler by type {}, context= {}", processType.getCode(), context);
            throw new FrameworkException(FrameworkErrorCode.ProcessHandlerNotFound);
        }
        //the processor here is a StateMachineProcessHandler
        processor.process(context);
    }

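This code is not easy to follow, so we will study it in four steps.

Step one: let's look at the process method of StateMachineProcessHandler, which acts as a proxy for the process method of ServiceTaskStateHandler: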

    public void process(ProcessContext context) throws FrameworkException {
        /**
         * instruction = {StateInstruction@11057}
         * stateName = null
         * stateMachineName = "buyGoodsOnline"
         * tenantId = "000001"
         * end = false
         * temporaryState = null
         */
        StateInstruction instruction = context.getInstruction(StateInstruction.class);
        //the state implementation here is ServiceTaskStateImpl
        State state = instruction.getState(context);
        String stateType = state.getType();
        //the stateHandler implementation here is ServiceTaskStateHandler
        StateHandler stateHandler = stateHandlers.get(stateType);

        List<StateHandlerInterceptor> interceptors = null;
        if (stateHandler instanceof InterceptableStateHandler) {
            //the list holds one element, a ServiceTaskHandlerInterceptor
            interceptors = ((InterceptableStateHandler)stateHandler).getInterceptors();
        }

        List<StateHandlerInterceptor> executedInterceptors = null;
        Exception exception = null;
        try {
            if (interceptors != null && interceptors.size() > 0) {
                executedInterceptors = new ArrayList<>(interceptors.size());
                for (StateHandlerInterceptor interceptor : interceptors) {
                    executedInterceptors.add(interceptor);
                    interceptor.preProcess(context);
                }
            }

            stateHandler.process(context);

        } catch (Exception e) {
            exception = e;
            throw e;
        } finally {

            if (executedInterceptors != null && executedInterceptors.size() > 0) {
                for (int i = executedInterceptors.size() - 1; i >= 0; i--) {
                    StateHandlerInterceptor interceptor = executedInterceptors.get(i);
                    interceptor.postProcess(context, exception);
                }
            }
        }
    }

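From this method we can see that the proxy wraps stateHandler.process with before and after advice. The advice class is ServiceTaskHandlerInterceptor, and the before and after advice call the interceptor's preProcess and postProcess respectively.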
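Stripped of the Seata types, the proxy boils down to a small pattern: call preProcess in list order, run the handler, then call postProcess in reverse order from a finally block, passing along any exception. Below is a self-contained sketch of that pattern. InterceptorChainSketch and its members are hypothetical names, and a list of strings stands in for ProcessContext.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the pre/post interceptor pattern; not Seata code.
class InterceptorChainSketch {

    interface Interceptor {
        void preProcess(List<String> log);
        void postProcess(List<String> log, Exception error);
    }

    /** Builds an interceptor that just records when it is called. */
    static Interceptor named(String name) {
        return new Interceptor() {
            @Override
            public void preProcess(List<String> log) {
                log.add("pre:" + name);
            }

            @Override
            public void postProcess(List<String> log, Exception error) {
                log.add("post:" + name);
            }
        };
    }

    /** Runs pre hooks in order, the handler, then post hooks in reverse order. */
    static void process(List<Interceptor> interceptors, Runnable handler, List<String> log) {
        List<Interceptor> executed = new ArrayList<>();
        Exception error = null;
        try {
            for (Interceptor interceptor : interceptors) {
                executed.add(interceptor); // remembered before preProcess, as in the code above
                interceptor.preProcess(log);
            }
            handler.run();
        } catch (RuntimeException e) {
            error = e;
            throw e;
        } finally {
            for (int i = executed.size() - 1; i >= 0; i--) {
                executed.get(i).postProcess(log, error);
            }
        }
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>();
        process(Arrays.asList(named("A"), named("B")), () -> log.add("handle"), log);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the post hooks run in finally, they see the handler's exception and can record a failed status before it propagates.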

Step two: let's look at the advice logic, the preProcess and postProcess methods of ServiceTaskHandlerInterceptor:

    public class ServiceTaskHandlerInterceptor implements StateHandlerInterceptor {
        //part of the code omitted
        @Override
        public void preProcess(ProcessContext context) throws EngineExecutionException {

            StateInstruction instruction = context.getInstruction(StateInstruction.class);

            StateMachineInstance stateMachineInstance = (StateMachineInstance)context.getVariable(
                DomainConstants.VAR_NAME_STATEMACHINE_INST);
            StateMachineConfig stateMachineConfig = (StateMachineConfig)context.getVariable(
                DomainConstants.VAR_NAME_STATEMACHINE_CONFIG);

            //on timeout, change the state machine's status to FA
            if (EngineUtils.isTimeout(stateMachineInstance.getGmtUpdated(), stateMachineConfig.getTransOperationTimeout())) {
                String message = "Saga Transaction [stateMachineInstanceId:" + stateMachineInstance.getId()
                        + "] has timed out, stop execution now.";
                //construction of `exception` from message omitted
                EngineUtils.failStateMachine(context, exception);
                throw exception;
            }

            StateInstanceImpl stateInstance = new StateInstanceImpl();

            Map<String, Object> contextVariables = (Map<String, Object>)context.getVariable(
                DomainConstants.VAR_NAME_STATEMACHINE_CONTEXT);
            ServiceTaskStateImpl state = (ServiceTaskStateImpl)instruction.getState(context);
            List<Object> serviceInputParams = null;

            Object isForCompensation = state.isForCompensation();
            if (isForCompensation != null && (Boolean)isForCompensation) {
                CompensationHolder compensationHolder = CompensationHolder.getCurrent(context, true);
                StateInstance stateToBeCompensated = compensationHolder.getStatesNeedCompensation().get(state.getName());
                if (stateToBeCompensated != null) {

                    stateToBeCompensated.setCompensationState(stateInstance);
                    stateInstance.setStateIdCompensatedFor(stateToBeCompensated.getId());
                } else {
                    LOGGER.error("Compensation State[{}] has no state to compensate, maybe this is a bug.",
                        state.getName());
                }
                //add to the compensation collection
                CompensationHolder.getCurrent(context, true).addForCompensationState(stateInstance.getName(),
                    stateInstance);
            }
            //part of the code omitted
            stateInstance.setInputParams(serviceInputParams);

            if (stateMachineInstance.getStateMachine().isPersist() && state.isPersist()
                && stateMachineConfig.getStateLogStore() != null) {

                try {
                    //record the branch transaction in the database with status RU
                    /**
                     * INSERT INTO seata_state_inst (id, machine_inst_id, name, type, gmt_started, service_name, service_method, service_type, is_for_update, input_params, status, business_key, state_id_compensated_for, state_id_retried_for)
                     * VALUES ('4fe5f602452c84ba5e88fd2ee9c13b35', '192.168.59.146:8091:65853497147990016', 'SaveOrder', 'ServiceTask', '2020-10-31 17:18:40.84', 'orderSave',
                     * 'saveOrder', null, 1, '["1604135904773",{"@type":"io.seata.sample.entity.Order","count":1,"payAmount":50,"productId":1,"userId":1}]', 'RU', null, null, null)
                     */
                    stateMachineConfig.getStateLogStore().recordStateStarted(stateInstance, context);
                }
                //catch block omitted
            }
            //part of the code omitted
            stateMachineInstance.putStateInstance(stateInstance.getId(), stateInstance);//put into the stateMap of StateMachineInstanceImpl, for retries or transaction compensation
            ((HierarchicalProcessContext)context).setVariableLocally(DomainConstants.VAR_NAME_STATE_INST, stateInstance);//record the state; it is later passed to TaskStateRouter to decide whether the global transaction has ended
        }

        @Override
        public void postProcess(ProcessContext context, Exception exp) throws EngineExecutionException {

            StateInstruction instruction = context.getInstruction(StateInstruction.class);
            ServiceTaskStateImpl state = (ServiceTaskStateImpl)instruction.getState(context);

            StateMachineInstance stateMachineInstance = (StateMachineInstance)context.getVariable(
                DomainConstants.VAR_NAME_STATEMACHINE_INST);
            StateInstance stateInstance = (StateInstance)context.getVariable(DomainConstants.VAR_NAME_STATE_INST);
            if (stateInstance == null || !stateMachineInstance.isRunning()) {
                LOGGER.warn("StateMachineInstance[id:" + stateMachineInstance.getId() + "] is end. stop running");
                return;
            }

            StateMachineConfig stateMachineConfig = (StateMachineConfig)context.getVariable(
                DomainConstants.VAR_NAME_STATEMACHINE_CONFIG);

            if (exp == null) {
                exp = (Exception)context.getVariable(DomainConstants.VAR_NAME_CURRENT_EXCEPTION);
            }
            stateInstance.setException(exp);

            //set the transaction status
            decideExecutionStatus(context, stateInstance, state, exp);
            //part of the code omitted

            Map<String, Object> contextVariables = (Map<String, Object>)context.getVariable(
                DomainConstants.VAR_NAME_STATEMACHINE_CONTEXT);
            //part of the code omitted

            context.removeVariable(DomainConstants.VAR_NAME_OUTPUT_PARAMS);
            context.removeVariable(DomainConstants.VAR_NAME_INPUT_PARAMS);

            stateInstance.setGmtEnd(new Date());

            if (stateMachineInstance.getStateMachine().isPersist() && state.isPersist()
                && stateMachineConfig.getStateLogStore() != null) {
                //update the branch transaction's status to succeeded
                /**
                 * UPDATE seata_state_inst SET gmt_end = '2020-10-31 17:18:49.919', excep = null, status = 'SU',
                 * output_params = 'true' WHERE id = '4fe5f602452c84ba5e88fd2ee9c13b35' AND
                 * machine_inst_id = '192.168.59.146:8091:65853497147990016'
                 */
                stateMachineConfig.getStateLogStore().recordStateFinished(stateInstance, context);
            }
            //part of the code omitted
        }
    }

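From this code we can see that before the branch transaction executes, a StateInstanceImpl is created and put into the ProcessContext, and after the branch transaction executes, this StateInstanceImpl is updated. The StateInstanceImpl serves three purposes:

It is put into the stateMap of StateMachineInstanceImpl, for use in retries or transaction compensation.

It records how the branch transaction executed, and can be persisted to the seata_state_inst table.

It is passed to TaskStateRouter, which uses it to decide whether the global transaction has ended.

Step three: let's look at the proxied method, stateHandler.process(context). In the normal execution path the stateHandler implementation is ServiceTaskStateHandler: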

    public void process(ProcessContext context) throws EngineExecutionException {

        StateInstruction instruction = context.getInstruction(StateInstruction.class);
        ServiceTaskStateImpl state = (ServiceTaskStateImpl) instruction.getState(context);
        StateInstance stateInstance = (StateInstance) context.getVariable(DomainConstants.VAR_NAME_STATE_INST);
        String serviceName = state.getServiceName();
        String methodName = state.getServiceMethod();

        Object result;
        try {
            /**
             * The input here is what we defined in the JSON. For the orderSave ServiceTask, for example, the input is:
             * 0 = "1608714480316"
             * 1 = {Order@11271} "Order(id=null, userId=1, productId=1, count=1, payAmount=50, status=null)"
             * The JSON definition is:
             * "Input": [
             *     "$.[businessKey]",
             *     "$.[order]"
             * ]
             */
            List<Object> input = (List<Object>) context.getVariable(DomainConstants.VAR_NAME_INPUT_PARAMS);

            //Set the current task execution status to RU (Running)
            stateInstance.setStatus(ExecutionStatus.RU);//set the status

            if (state instanceof CompensateSubStateMachineState) {
                //sub state machines are not covered here
            } else {
                StateMachineConfig stateMachineConfig = (StateMachineConfig) context.getVariable(
                        DomainConstants.VAR_NAME_STATEMACHINE_CONFIG);
                //state.getServiceType here is springBean
                ServiceInvoker serviceInvoker = stateMachineConfig.getServiceInvokerManager().getServiceInvoker(
                        state.getServiceType());
                if (serviceInvoker == null) {
                    throw new EngineExecutionException("No such ServiceInvoker[" + state.getServiceType() + "]",
                            FrameworkErrorCode.ObjectNotExists);
                }
                if (serviceInvoker instanceof ApplicationContextAware) {
                    ((ApplicationContextAware) serviceInvoker).setApplicationContext(
                            stateMachineConfig.getApplicationContext());
                }
                //this invokes the ServiceTask method defined in the JSON, for example the saveOrder method of orderSave
                result = serviceInvoker.invoke(state, input.toArray());
            }

            if (LOGGER.isDebugEnabled()) {
                LOGGER.debug("<<<<<<<<<<<<<<<<<<<<<< State[{}], ServiceName[{}], Method[{}] Execute finish. result: {}",
                        state.getName(), serviceName, methodName, result);
            }
            //part of the code omitted

        }
        //exception-handling code omitted
    }

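As you can see, this process method is the core of the business handling: it uses reflection to invoke the ServiceTask method defined in the JSON, and, depending on the resulting status, triggers the Next state, that is, the next ServiceTask in the flow.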
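To illustrate what "invoking by reflection" means here, the sketch below looks up a method by its name on a bean and calls it with the resolved inputs. This is a simplification under stated assumptions: ReflectiveInvokeSketch and OrderSaveService are hypothetical, the order is represented as a plain String, and the method is matched by name and argument count only, which may differ from how Seata's own invoker resolves methods.

```java
import java.lang.reflect.Method;

// Hypothetical sketch of reflection-based service invocation; not Seata's ServiceInvoker.
class ReflectiveInvokeSketch {

    /** Stand-in for the bean registered under the name "orderSave". */
    public static class OrderSaveService {
        public boolean saveOrder(String businessKey, String order) {
            return businessKey != null && order != null;
        }
    }

    /** Finds a public method by name and argument count, then invokes it. */
    static Object invoke(Object bean, String methodName, Object... args) {
        for (Method method : bean.getClass().getMethods()) {
            if (method.getName().equals(methodName) && method.getParameterCount() == args.length) {
                try {
                    return method.invoke(bean, args);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Invocation of " + methodName + " failed", e);
                }
            }
        }
        throw new IllegalArgumentException("No such method: " + methodName);
    }

    public static void main(String[] args) {
        // "saveOrder" plays the role of ServiceMethod, the two strings the role of Input
        Object result = invoke(new OrderSaveService(), "saveOrder", "1604135904773", "order-1");
        System.out.println(result);
    }
}
```

The returned object corresponds to #root in the JSON, which the Status and Output blocks then evaluate.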

Step four: let's look at the route method of CustomizeBusinessProcessor:

    public void route(ProcessContext context) throws FrameworkException {

        //code = "STATE_LANG"
        //message = "SEATA State Language"
        //name = "STATE_LANG"
        //ordinal = 0
        ProcessType processType = matchProcessType(context);

        RouterHandler router = routerHandlers.get(processType.getCode());
        //the route method of DefaultRouterHandler
        router.route(context);
    }

Let's look at the route method of DefaultRouterHandler:

    public void route(ProcessContext context) throws FrameworkException {

        try {
            ProcessType processType = matchProcessType(context);
            //the processRouter here is a StateMachineProcessRouter
            ProcessRouter processRouter = processRouters.get(processType.getCode());
            Instruction instruction = processRouter.route(context);
            if (instruction == null) {
                LOGGER.info("route instruction is null, process end");
            } else {
                context.setInstruction(instruction);

                eventPublisher.publish(context);
            }
        } catch (FrameworkException e) {
            throw e;
        } catch (Exception ex) {
            throw new FrameworkException(ex, ex.getMessage(), FrameworkErrorCode.UnknownAppError);
        }
    }

Next comes the route method of StateMachineProcessRouter, which also uses the proxy pattern:

public Instruction route(ProcessContext context) throws FrameworkException {

    StateInstruction stateInstruction = context.getInstruction(StateInstruction.class);

    State state;
    if (stateInstruction.getTemporaryState() != null) {
        state = stateInstruction.getTemporaryState();
        stateInstruction.setTemporaryState(null);
    } else {
        // The normal flow takes this branch
        StateMachineConfig stateMachineConfig = (StateMachineConfig)context.getVariable(
            DomainConstants.VAR_NAME_STATEMACHINE_CONFIG);
        StateMachine stateMachine = stateMachineConfig.getStateMachineRepository().getStateMachine(
            stateInstruction.getStateMachineName(), stateInstruction.getTenantId());
        state = stateMachine.getStates().get(stateInstruction.getStateName());
    }

    String stateType = state.getType();

    StateRouter router = stateRouters.get(stateType);

    Instruction instruction = null;

    List<StateRouterInterceptor> interceptors = null;
    if (router instanceof InterceptableStateRouter) {
        // Only EndStateRouter qualifies here; its interceptor is EndStateRouterInterceptor
        interceptors = ((InterceptableStateRouter)router).getInterceptors();
    }

    List<StateRouterInterceptor> executedInterceptors = null;
    Exception exception = null;
    try {
        // The pre-route advice has an empty implementation, so that code is omitted here
        // (the omitted code is also what populates executedInterceptors)
        instruction = router.route(context, state);

    } catch (Exception e) {
        exception = e;
        throw e;
    } finally {

        if (executedInterceptors != null && executedInterceptors.size() > 0) {
            for (int i = executedInterceptors.size() - 1; i >= 0; i--) {
                StateRouterInterceptor interceptor = executedInterceptors.get(i);
                // Post-route advice: ends the state machine
                interceptor.postRoute(context, state, instruction, exception);
            }
        }

        //if 'Succeed' or 'Fail' State did not configured, we must end the state machine
        if (instruction == null && !stateInstruction.isEnd()) {
            EngineUtils.endStateMachine(context);
        }
    }

    return instruction;
}

The proxy here implements only post-route advice, and all that advice does is end the state machine.
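The interception structure used by this route method can be sketched in isolation as follows. This is a simplified illustration, not Seata code: RouteInterceptorSketch, RecordingInterceptorSketch and InterceptedRouterSketch are hypothetical names, states are reduced to plain strings, and the router is modelled as a Function. What it shows is the shape of the try/finally: pre-advice runs in order, and post-advice runs in reverse order whether or not routing threw.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical, simplified counterpart of StateRouterInterceptor.
interface RouteInterceptorSketch {
    void preRoute(String stateName);
    void postRoute(String stateName, String nextState, Exception error);
}

// Records each call so that the ordering of the advice can be inspected.
class RecordingInterceptorSketch implements RouteInterceptorSketch {
    private final String id;
    private final List<String> log;

    RecordingInterceptorSketch(String id, List<String> log) {
        this.id = id;
        this.log = log;
    }

    @Override
    public void preRoute(String stateName) {
        log.add("pre:" + id);
    }

    @Override
    public void postRoute(String stateName, String nextState, Exception error) {
        log.add("post:" + id);
    }
}

class InterceptedRouterSketch {
    // Wraps a routing function with pre-advice and guaranteed post-advice.
    static String route(String stateName, List<RouteInterceptorSketch> interceptors,
                        Function<String, String> router) {
        List<RouteInterceptorSketch> executed = new ArrayList<>();
        String nextState = null;
        RuntimeException error = null;
        try {
            for (RouteInterceptorSketch interceptor : interceptors) {
                executed.add(interceptor);
                interceptor.preRoute(stateName);
            }
            nextState = router.apply(stateName);
        } catch (RuntimeException e) {
            error = e;
            throw e;
        } finally {
            // Post-advice runs in reverse order, whether routing succeeded or failed.
            for (int i = executed.size() - 1; i >= 0; i--) {
                executed.get(i).postRoute(stateName, nextState, error);
            }
        }
        return nextState;
    }
}
```

In the Seata listing the post-advice slot is where the state machine gets ended; in this sketch it only writes to a log.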

Next, let's look at StateRouter. Its UML class diagram is as follows:

 

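From the UML class diagram we can see that, apart from EndStateRouter, there is only TaskStateRouter. EndStateRouter itself does very little, because the logic that closes the state machine is already handled by the proxy. So let's look at TaskStateRouter: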

public Instruction route(ProcessContext context, State state) throws EngineExecutionException {

    StateInstruction stateInstruction = context.getInstruction(StateInstruction.class);
    if (stateInstruction.isEnd()) {
        // Already ended: return immediately
        // code omitted
    }

    //The current CompensationTriggerState can mark the compensation process is started and perform compensation
    // route processing.
    State compensationTriggerState = (State)context.getVariable(
        DomainConstants.VAR_NAME_CURRENT_COMPEN_TRIGGER_STATE);
    if (compensationTriggerState != null) {
        // Add to the compensation collection, compensate, and return
        return compensateRoute(context, compensationTriggerState);
    }

    //There is an exception route, indicating that an exception is thrown, and the exception route is prioritized.
    String next = (String)context.getVariable(DomainConstants.VAR_NAME_CURRENT_EXCEPTION_ROUTE);

    if (StringUtils.hasLength(next)) {
        context.removeVariable(DomainConstants.VAR_NAME_CURRENT_EXCEPTION_ROUTE);
    } else {
        next = state.getNext();
    }

    //If next is empty, the state selected by the Choice state was taken.
    if (!StringUtils.hasLength(next) && context.hasVariable(DomainConstants.VAR_NAME_CURRENT_CHOICE)) {
        next = (String)context.getVariable(DomainConstants.VAR_NAME_CURRENT_CHOICE);
        context.removeVariable(DomainConstants.VAR_NAME_CURRENT_CHOICE);
    }
    // No next node can be found in the current context: return null
    if (!StringUtils.hasLength(next)) {
        return null;
    }

    StateMachine stateMachine = state.getStateMachine();

    State nextState = stateMachine.getState(next);
    if (nextState == null) {
        throw new EngineExecutionException("Next state[" + next + "] is not exits",
            FrameworkErrorCode.ObjectNotExists);
    }
    // Record the next state to transition to in stateInstruction
    stateInstruction.setStateName(next);

    return stateInstruction;
}

As you can see, the job of route is to determine the next node for the state machine and store it in the stateInstruction held by the current context.
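The order in which TaskStateRouter picks the next state can be sketched as a small standalone function. This is an illustration only: NextStateResolverSketch and the two context keys are hypothetical, the context is a plain Map rather than a ProcessContext, and the compensation check that comes first in the real method is left out.

```java
import java.util.HashMap;
import java.util.Map;

class NextStateResolverSketch {
    // Hypothetical context keys standing in for the DomainConstants used by Seata.
    static final String EXCEPTION_ROUTE = "exceptionRoute";
    static final String CHOICE = "choice";

    // Priority: exception route first, then the state's declared Next, then a pending Choice result.
    static String resolveNext(Map<String, Object> context, String declaredNext) {
        String next = (String) context.get(EXCEPTION_ROUTE);
        if (next != null && !next.isEmpty()) {
            // The exception route is consumed once it has been used.
            context.remove(EXCEPTION_ROUTE);
        } else {
            next = declaredNext;
        }
        if ((next == null || next.isEmpty()) && context.containsKey(CHOICE)) {
            next = (String) context.remove(CHOICE);
        }
        // A null result means there is nothing left to route to.
        return (next == null || next.isEmpty()) ? null : next;
    }

    public static void main(String[] args) {
        Map<String, Object> context = new HashMap<>();
        context.put(EXCEPTION_ROUTE, "Fail");
        System.out.println(resolveNext(context, "ReduceStorage"));
        System.out.println(resolveNext(context, "ReduceStorage"));
    }
}
```

Because the exception route is removed once used, the second call in main falls back to the declared Next.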

At this point we have finished analyzing how the state machine works; the logic lives in the ProcessControllerImpl class.

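Note that once the next node has been obtained it is not processed directly. Instead, following the observer pattern, it is first published to the EventBus and waits for an observer to handle it. This repeats until EndStateRouter ends the state machine.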
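The publish-and-consume loop described here can be sketched roughly as follows. Everything in it is a simplifying assumption: Context stands in for ProcessContext, Bus is a tiny synchronous queue rather than Seata's EventBus, the flow is a plain map from state name to next state, and the Choice states of the demo are skipped.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.function.Consumer;

class EventLoopSketch {
    // Hypothetical stand-in for ProcessContext: carries the name of the state to run.
    static class Context {
        String stateName;
        final List<String> executed = new ArrayList<>();

        Context(String stateName) {
            this.stateName = stateName;
        }
    }

    // Minimal synchronous event bus: published contexts are queued, then handed to the consumer.
    static class Bus {
        private final Queue<Context> queue = new ArrayDeque<>();
        private Consumer<Context> consumer;

        void subscribe(Consumer<Context> consumer) {
            this.consumer = consumer;
        }

        void publish(Context context) {
            queue.add(context);
        }

        void drain() {
            Context context;
            while ((context = queue.poll()) != null) {
                consumer.accept(context);
            }
        }
    }

    // Runs the flow from startState and returns the states in the order they were handled.
    static List<String> run(Map<String, String> nextByState, String startState) {
        Bus bus = new Bus();
        bus.subscribe(event -> {
            event.executed.add(event.stateName);             // "process": handle the current state
            String next = nextByState.get(event.stateName);  // "route": decide the next state
            if (next != null) {
                event.stateName = next;
                bus.publish(event);                          // hand the next step back to the bus
            }
        });
        Context start = new Context(startState);
        bus.publish(start);
        bus.drain();
        return start.executed;
    }
}
```

Calling run with SaveOrder mapped to ReduceAccount and ReduceAccount mapped to ReduceStorage handles the three states in that order and stops when no next state is found, which mirrors the null instruction case in DefaultRouterHandler.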

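The event in this observer pattern is the ProcessContext, which contains an Instruction. The Instruction in turn contains the State, and that State determines the next node to process, until the flow ends. The UML class diagram is as follows: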

 

Summary

Saga mode in the Seata middleware is widely used, but its code is fairly complex. I walked through it from the following angles:

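  • The JSON file we define is loaded into the class StateMachineImpl.
  • Starting the state machine also starts the global transaction. This works the same way as starting a global transaction in the ordinary modes: a message is sent to the TC.
  • The entry class for handling state-machine states and controlling state transitions is ProcessControllerImpl; you can trace the code starting from its process method.
  • ProcessControllerImpl calls the process method of CustomizeBusinessProcessor to handle the current state, then calls route to obtain the next node and publish it to the EventBus.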

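Saga mode introduces 3 additional tables, and we can also trace the code through the 2 of them that relate to the global transaction and the branch transactions. For the demo I gave earlier, when the transaction succeeds, the SQL written to these 2 tables, in the order the state machine executes, is as follows: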

INSERT INTO seata_state_machine_inst
(id, machine_id, tenant_id, parent_id, gmt_started, business_key, start_params, is_running, status, gmt_updated)
VALUES ('192.168.59.146:8091:65853497147990016', '06a098cab53241ca7ed09433342e9f07', '000001', null, '2020-10-31 17:18:24.773', '1604135904773', '{"@type":"java.util.HashMap","money":50.,"productId":1L,"_business_key_":"1604135904773","businessKey":"1604135904773",\"count\":1,\"mockreduceaccountfail\":\"true\","userId":1L,"order":{"@type":"io.seata.sample.entity.Order","count":1,"payAmount":50,"productId":1,"userId":1}}', 1, 'RU', '2020-10-31 17:18:24.773');

INSERT INTO seata_state_inst (id, machine_inst_id, name, type, gmt_started, service_name, service_method, service_type, is_for_update, input_params, status, business_key, state_id_compensated_for, state_id_retried_for)
VALUES ('4fe5f602452c84ba5e88fd2ee9c13b35', '192.168.59.146:8091:65853497147990016', 'SaveOrder', 'ServiceTask', '2020-10-31 17:18:40.84', 'orderSave', 'saveOrder', null, 1, '["1604135904773",{"@type":"io.seata.sample.entity.Order","count":1,"payAmount":50,"productId":1,"userId":1}]', 'RU', null, null, null);

UPDATE seata_state_inst SET gmt_end = '2020-10-31 17:18:49.919', excep = null, status = 'SU', output_params = 'true' WHERE id = '4fe5f602452c84ba5e88fd2ee9c13b35' AND machine_inst_id = '192.168.59.146:8091:65853497147990016';

INSERT INTO seata_state_inst (id, machine_inst_id, name, type, gmt_started, service_name, service_method, service_type, is_for_update, input_params, status, business_key, state_id_compensated_for, state_id_retried_for)
VALUES ('8371235cb2c66c8626e148f66123d3b4', '192.168.59.146:8091:65853497147990016', 'ReduceAccount', 'ServiceTask', '2020-10-31 17:19:00.441', 'accountService', 'decrease', null, 1, '["1604135904773",1L,50.,{"@type":"java.util.LinkedHashMap","throwException":"true"}]', 'RU', null, null, null);

UPDATE seata_state_inst SET gmt_end = '2020-10-31 17:19:09.593', excep = null, status = 'SU', output_params = 'true' WHERE id = '8371235cb2c66c8626e148f66123d3b4' AND machine_inst_id = '192.168.59.146:8091:65853497147990016';

INSERT INTO seata_state_inst (id, machine_inst_id, name, type, gmt_started, service_name, service_method, service_type, is_for_update, input_params, status, business_key, state_id_compensated_for, state_id_retried_for)
VALUES ('e70a49f1eac72f929085f4e82c2b4de2', '192.168.59.146:8091:65853497147990016', 'ReduceStorage', 'ServiceTask', '2020-10-31 17:19:18.494', 'storageService', 'decrease', null, 1, '["1604135904773",1L,1,{"@type":"java.util.LinkedHashMap"}]', 'RU', null, null, null);

UPDATE seata_state_inst SET gmt_end = '2020-10-31 17:19:26.613', excep = null, status = 'SU', output_params = 'true' WHERE id = 'e70a49f1eac72f929085f4e82c2b4de2' AND machine_inst_id = '192.168.59.146:8091:65853497147990016';

UPDATE seata_state_machine_inst SET gmt_end = '2020-10-31 17:19:33.581', excep = null, end_params = '{"@type":"java.util.HashMap","productId":1L,"count":1,"ReduceAccountResult":true,"mockReduceAccountFail":"true","userId":1L,"money":50.,"SaveOrderResult":true,"_business_key_":"1604135904773","businessKey":"1604135904773","ReduceStorageResult":true,"order":{"@type":"io.seata.sample.entity.Order","count":1,"id":60,"payAmount":50,"productId":1,"userId":1}}', status = 'SU', compensation_status = null, is_running = 0, gmt_updated = '2020-10-31 17:19:33.582' WHERE id = '192.168.59.146:8091:65853497147990016' and gmt_updated = '2020-10-31 17:18:24.773';

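In this article I mainly studied the Saga mode source code along the normal, successful flow. Many details remain unanalyzed, such as the rollback or compensation logic after a global transaction fails; I hope to cover them another time.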

 

 

Editor: 武晓燕 Source: 程序员jinjunzhu