Academic Journal of Engineering and Technology Science, 2021, 4(3); doi: 10.25236/AJETS.2021.040306.
The Stony Brook School, Stony Brook, NY, USA
Traditionally, robotic arms on production lines perform actions by computing rotation matrices that control the movement of each joint, and they repeat their work by following previously programmed commands. Although accurate and efficient, traditional robotic arms are unable to complete their tasks when the preset conditions change. In this paper, we propose a novel Q-learning method with a residual neural network to train robotic arms. Compared with traditional models, this method yields better performance for robotic arms after training. The environment is abstracted into the fetch-and-place problem. The trained agent learns a policy for fetching various objects, with an accuracy of 92.37%. Because it takes an average of only about 32 consecutive commands to complete a task, it is more efficient in execution than any other agent trained only by the usual reinforcement learning method.
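To make the Q-learning idea concrete, the following is a minimal tabular sketch on a hypothetical one-dimensional fetch task. It is not the method of the paper, which uses a residual neural network: the grid world, the reward values, and the names `train_q_table` and `greedy_rollout` are all assumptions introduced here for illustration only.

```python
import random


def train_q_table(n_cells=6, episodes=500, alpha=0.5, gamma=0.9,
                  epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy 1-D fetch task (illustrative assumption).

    The gripper starts at cell 0 and the object sits at the last cell.
    Actions: 0 = move left, 1 = move right.
    """
    rng = random.Random(seed)
    goal = n_cells - 1
    q = [[0.0, 0.0] for _ in range(n_cells)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on commands per episode
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                # exploit, breaking ties between equal values at random
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(2) if q[state][a] == best])
            if action == 0:
                next_state = max(0, state - 1)
            else:
                next_state = min(goal, state + 1)
            done = next_state == goal
            # small step penalty, reward of 1 for reaching the object
            reward = 1.0 if done else -0.01
            # Q-learning target: bootstrap from the best next action
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_rollout(q, max_steps=100):
    """Follow the learned greedy policy from cell 0; return (reached, steps)."""
    goal = len(q) - 1
    state, steps = 0, 0
    while state != goal and steps < max_steps:
        action = 0 if q[state][0] > q[state][1] else 1
        state = max(0, state - 1) if action == 0 else min(goal, state + 1)
        steps += 1
    return state == goal, steps


if __name__ == "__main__":
    q_table = train_q_table()
    print(greedy_rollout(q_table))
```

With the default six cells, the greedy rollout is expected to reach the object in the minimum of five moves. The step penalty plays the role that the command count plays in the abstract: a policy that reaches the object with fewer commands collects a higher return.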
Robotic Arm, Reinforcement Learning, Deep Learning
Xiao Huang. Fetching Policy of Intelligent Robotic Arm Based on Multiple-agents Reinforcement Learning Method. Academic Journal of Engineering and Technology Science (2021) Vol. 4, Issue 3: 52-57. https://doi.org/10.25236/AJETS.2021.040306.