Academic Journal of Business & Management, 2022, 4(4); doi: 10.25236/AJBM.2022.040417.

Currency Investment Strategy Based on State-Action-Reward-State-Action


Penglan Liu, Xuyu Hou

Corresponding Author:
Penglan Liu

School of Electronic and Information Engineering, Liaoning Technical University, Huludao, 125105, Liaoning, China


Investment products have emerged in the market that seek to profit from currency appreciation. After the bull market peaked, bitcoin prices gradually fell. As a new kind of asset, bitcoin lacks substantial economic backing, so price fluctuations are inevitable. To avoid risk wherever possible, we propose an online reinforcement learning model, Sarsa (State-Action-Reward-State-Action). As an online learner, the model samples small batches of past data. As a reinforcement learner, it explores past experience in order to learn to avoid risk and improve investment efficiency. We compare our Sarsa model with a recent LSTM-based portfolio model. The experiment shows that the Sarsa model attains the lowest regret value and is only weakly sensitive to the transaction commission rate.
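
For readers unfamiliar with Sarsa, the on-policy update it is named after can be sketched as below. This is a minimal tabular illustration, not the authors' implementation: the two-action setup (hold cash or hold the asset), the state encoding (direction of the last price move plus current position), the reward (one-step return minus a commission charged on each switch), and all hyperparameter values are assumptions made for the example.

```python
import random


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """One Sarsa step: Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))."""
    td_target = r + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (td_target - q[(s, a)])
    return q[(s, a)]


def epsilon_greedy(q, s, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(s, a)])


def train(prices, commission=0.002, episodes=200, epsilon=0.1, seed=0):
    """Tabular Sarsa on a price series (hypothetical toy setup)."""
    rng = random.Random(seed)
    actions = (0, 1)  # assumed actions: 0 = hold cash, 1 = hold the asset
    # Assumed state: (last price move: 0 down / 1 up, current position).
    q = {((m, p), a): 0.0 for m in (0, 1) for p in (0, 1) for a in actions}
    for _ in range(episodes):
        position = 0
        s = (int(prices[1] > prices[0]), position)
        a = epsilon_greedy(q, s, actions, epsilon, rng)
        for t in range(1, len(prices) - 1):
            ret = prices[t + 1] / prices[t] - 1.0
            # Commission is paid only when the position changes.
            cost = commission if a != position else 0.0
            r = a * ret - cost
            position = a
            s_next = (int(prices[t + 1] > prices[t]), position)
            # On-policy: the next action is chosen by the same policy
            # and is the one actually used in the update.
            a_next = epsilon_greedy(q, s_next, actions, epsilon, rng)
            sarsa_update(q, s, a, r, s_next, a_next)
            s, a = s_next, a_next
    return q
```

Calling `train(prices)` on a list of at least three prices returns the learned action-value table. Because the commission enters the reward directly, changing the `commission` argument is one simple way to probe how sensitive the learned policy is to transaction costs.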


Currency Investment; Portfolio Optimization; Reinforcement Learning; Transaction Commission

Cite This Paper

Penglan Liu, Xuyu Hou. Currency Investment Strategy Based on State-Action-Reward-State-Action. Academic Journal of Business & Management (2022) Vol. 4, Issue 4: 83-88. https://doi.org/10.25236/AJBM.2022.040417.

