The Transactions of the Korean Institute of Electrical Engineers

  1. (Dept. of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea.)



Reinforcement learning, Pendubot, Sim-to-Real Learning, LW-RCP

1. Introduction

The 2016 Go match between AlphaGo and Lee Sedol is regarded as a major milestone signaling the start of the Fourth Industrial Revolution. The event demonstrated that artificial intelligence had grown beyond a purely academic research topic into a practical technology capable of affecting society as a whole [1]. It suggests that AI has acquired the ability to imitate human behavior and decision-making and, further, that it may solve problems more efficiently than humans. In particular, AlphaGo's victory drew attention as a case that proved to the world the usefulness of reinforcement learning, an advanced machine learning methodology. Here, reinforcement learning refers to a technique in which an AI algorithm learns an optimal behavior pattern on its own through trial-and-error interaction with a dynamic environment [2].

Reinforcement learning has also been recognized as useful in control engineering and is being actively studied in areas such as autonomous driving and robotics. This research has successfully applied reinforcement learning to a real autonomous driving system [3] and has achieved robust perceptive locomotion for quadrupedal robots [4], presenting new control methodologies. Reinforcement learning thus improves autonomy in complex environments and is proving its potential across a variety of control systems. As research combining reinforcement learning and control engineering advances, many educational institutions are developing and running new curricula under names such as "reinforcement learning" or "intelligent control," the latter reflecting the use of intelligent tools in control engineering [5-7]. Most of this education, however, focuses on theory and computer simulation, and applying reinforcement learning to a physical system remains difficult because of technical limitations. To drive a physical system, a microcontroller must collect data and deliver control signals. The reinforcement learning neural network that computes the control signal must then also be implemented on the microcontroller, but a microcontroller has limited performance and resources, which makes such an implementation technically challenging.

Solving this problem requires neural network compression techniques such as pruning, quantization, and knowledge distillation [8]. For students who want to apply reinforcement learning to a physical system, however, these techniques can be too high a barrier to entry and may pull the course away from its original goal. That goal is for students to develop the ability to apply the basic principles of reinforcement learning algorithms to a physical system without additional specialist knowledge.

To address this problem, this paper proposes a reinforcement-learning-based control education platform that makes it easy to apply a reinforcement learning neural network trained in Python to a physical system. The platform uses the Sim-to-Real technique, in which a neural network trained in a simulation environment is applied to the physical system. In the Sim-to-Real technique, the environment that the reinforcement learning agent interacts with is built as a simulation, and the neural network is trained in it [9].

The trained neural network is converted into data compatible with the Matlab/Simulink environment and used to implement the controller. The reinforcement-learning-based controller implemented in Matlab/Simulink computes the control input. Delivering this control input to the actuator of the physical system and collecting sensor data is handled by LW-RCP (Light-Weight Rapid Control Prototyping), which was developed in the authors' laboratory. LW-RCP allows the reinforcement learning algorithm to run on a PC, which has faster computation and more resources than a microcontroller, so the existing simulation-based teaching approach can be reused without techniques such as neural network compression. As a result, the proposed platform can effectively lower the barrier to applying a reinforcement-learning-based controller to a physical system. Students can thereby move beyond a theory-centered curriculum and benefit from hands-on learning in which they apply reinforcement learning algorithms to a physical system.

To verify the usefulness of the proposed platform, this paper carries out control experiments on a pendubot system. The pendubot is an underactuated system composed of two links. It is used as a testbed for a variety of control theories and has also been used in reinforcement learning research [10,11]. In particular, the nonlinear model equations of the pendubot pose a suitable challenge for evaluating the performance of a reinforcement-learning-based controller. For these reasons, the pendubot is used here to verify the effectiveness of the proposed platform.

The rest of this paper is organized as follows. Section 2 describes the structure of the platform, which combines Python, Matlab/Simulink, and LW-RCP. Section 3 derives the model equations of the pendubot needed for the Sim-to-Real technique and describes its equilibrium points. Section 4 designs the reinforcement-learning-based controller and verifies the usefulness of the proposed platform through pendubot control experiments, and Section 5 concludes the paper.

2. Structure of the Proposed Platform

2.1 Reinforcement-Learning-Based Controller Using the Sim-to-Real Technique

The reinforcement-learning-based controller plays the central role in the proposed platform. A reinforcement-learning-based controller is one in which the controller that performs the control computation in a classical control scheme is replaced by a reinforcement learning agent [12].

In reinforcement learning, an agent is a system that implements a reinforcement learning algorithm. The agent interacts with a given environment toward a specific goal, acts according to its own policy, and repeatedly improves that policy using the reward obtained as a result [13]. A reinforcement-learning-based controller completed in this way takes data acquired from the environment as input and outputs the control input, that is, the optimal action under the learned policy. The environment the agent interacts with is typically one of two kinds: a physical environment built on the real system, or a simulation-based virtual environment.

Interacting directly with the physical system has the advantage that no accurate model of the system is needed. Training in this way, however, takes real physical time, and the trial and error that occurs during training can cause safety accidents [14]. Because a hands-on course must be conducted safely within a limited time, this approach is difficult to adopt in such a course.

This paper therefore uses a reinforcement-learning-based controller trained with the Sim-to-Real technique, in which the agent interacts with a simulation environment. Because the environment is simulated, the time required for training can be greatly reduced. This in turn greatly reduces the time cost of controller design in a constrained educational setting and allows a more substantial course to be organized. In addition, all trial and error during training takes place in computer simulation, so the agent can be trained safely. Together, these properties allow the agent's training to be completed safely within limited class time.

Implementing the reinforcement learning agent, one of the key elements of the proposed platform, requires Python. This is because representative frameworks used for AI research, including reinforcement learning, such as PyTorch [15] and TensorFlow [16], are built for Python.

2.2 LW-RCP

The sensor data needed to operate the reinforcement-learning-based controller obtained through the Sim-to-Real technique is acquired through LW-RCP. An RCP system is a development environment that control engineers use to design and verify control algorithms quickly and efficiently [17,18].

The system consists of a device that serves as the hardware interface and library blocks used in Simulink. Fig. 1 shows a photograph of the LW-RCP hardware interface device. Through the LW-RCP hardware interface, data observed by the sensors can be sent to the PC, and the control input computed by the Simulink control model running on the PC can be applied to the actuator of the physical system. This exchange takes place over high-speed USB and supports a sampling frequency of up to 2 kHz.

Students taking the course use the blocks provided by the LW-RCP library to build the algorithm responsible for hardware access, and they build the reinforcement-learning-based controller from a neural network implemented in a Simulink "matlab function" block. LW-RCP handles the parts that need a hardware interface, namely measuring sensor data and applying control data to the actuator, while the PC handles the computation of the Simulink-based controller model. The operating principle of LW-RCP and the PC is shown in Fig. 2.

Fig. 1. LW-RCP02 hardware unit

Fig. 2. Flow chart of LW-RCP operation

Fig. 3. Input/Output library block of LW-RCP02

Fig. 3 shows the input/output library blocks provided by the RCP. Each block handles an input or output function, and students can use them to build the functions needed for hardware access with little effort. The blocks in the left column are receive blocks, which receive data from the hardware interface. The blocks in the right column are Send blocks, which send data to the hardware interface. A control algorithm that uses LW-RCP is therefore implemented by wiring Simulink blocks, which removes the need for students to write and debug microcontroller code and lets them concentrate on constructing the control algorithm.

2.3 Platform Structure

The proposed education platform implements, in real time, a sequence of steps that includes acquiring and transmitting data from the physical system, computing the control input from the received data, and the input and output of control signals. The aim of this sequence is to create and use an agent based on a reinforcement learning algorithm, and it is realized by combining the three systems shown in the concept diagram of Fig. 4.

As Fig. 4 shows, LW-RCP passes the data observed by the sensors to the Matlab/Simulink environment and, at the same time, applies the control input to the actuator of the physical system. Matlab/Simulink sits in the middle and acts as the real-time control system for the physical system. Python hosts the simulation environment in which the reinforcement learning agent is trained, and the trained neural network is converted into a format usable in Matlab/Simulink and used to build the reinforcement-learning-based controller.

Fig. 4. Concept diagram of the proposed platform.

To integrate the neural network trained in Python with the Sim-to-Real technique and the physical-system controller implemented in Matlab/Simulink into one platform, the "matlab.engine" Python API provided by MATLAB is used. This API exposes Matlab functionality to Python, so that Matlab functions can be called from inside Python code, and the reverse is also possible. Python and Matlab can also access each other's workspaces and use the variables stored there, which makes development more efficient. In this work, the parameters of the neural network trained in Python with the Sim-to-Real technique, namely its weights and biases, are saved to a file that can be used in the Matlab/Simulink environment. The saved parameters are then used as the building blocks of the reinforcement-learning-based controller in Matlab/Simulink [12].
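The parameter hand-off described above can be sketched as follows. This is only an illustrative sketch, not the authors' code: the variable names `W1`, `b1`, ... and the file name `policy_params.mat` are hypothetical, and `push_to_matlab` needs a local MATLAB installation with the MATLAB Engine API for Python. It relies only on `matlab.engine.start_matlab`, `eng.workspace`, `matlab.double`, `eng.eval`, and `eng.quit`.

```python
def flatten_layers(layers):
    """Turn [(weight, bias), ...] into {"W1": ..., "b1": ..., ...} of nested float lists."""
    params = {}
    for idx, (weight, bias) in enumerate(layers, start=1):
        params[f"W{idx}"] = [[float(v) for v in row] for row in weight]
        params[f"b{idx}"] = [[float(v)] for v in bias]  # stored as a column vector
    return params


def push_to_matlab(params, mat_path="policy_params.mat"):
    """Copy the parameters into the MATLAB workspace and save them to a MAT-file.

    The file name and variable names are assumptions made for this sketch.
    """
    import matlab.engine  # imported lazily: requires MATLAB to be installed

    eng = matlab.engine.start_matlab()
    try:
        for name, value in params.items():
            eng.workspace[name] = matlab.double(value)
        names = ", ".join(f"'{name}'" for name in params)
        eng.eval(f"save('{mat_path}', {names})", nargout=0)
    finally:
        eng.quit()
```

With a PyTorch model, `layers` could be built from the `weight` and `bias` tensors of each linear layer after converting them to lists. The conversion step is kept separate so that it can be checked without MATLAB.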

Combining the elements above, the platform that joins Python and Matlab/Simulink proceeds as follows. First, the reinforcement learning agent created in Python is trained in the simulation environment. During training, a simulation episode ends when the episode end time set by the user is exceeded or when a predefined termination condition is met. Once training is complete, the agent's parameters are passed to the Matlab workspace through the Python API. A deep neural network block that uses the received parameters is then built in the Simulink control system model. The reinforcement-learning-based controller implemented as this deep neural network performs the control computation on the sensor data and outputs the control input. The control input is delivered through LW-RCP to the actuator of the physical system, which carries out the control of the physical system.
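The computation that the deep neural network block performs on each sample can be sketched as a plain forward pass. This is a minimal sketch under stated assumptions: the ReLU nonlinearity and the limit of 100 on the control input come from the paper, while the final `tanh` squashing (a common choice for SAC/TQC-style policies evaluated deterministically) and the function names are assumptions, not the authors' implementation.

```python
import math


def affine(weight, bias, x):
    """Compute weight @ x + bias for nested-list weight and flat-list bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def policy_forward(layers, obs, u_max=100.0):
    """Map an observation vector to a bounded scalar control input.

    layers: [(weight, bias), ...]; ReLU on every layer except the last.
    The tanh output scaling is an assumption made for this sketch.
    """
    hidden = list(obs)
    for weight, bias in layers[:-1]:
        hidden = relu(affine(weight, bias, hidden))
    weight, bias = layers[-1]
    output = affine(weight, bias, hidden)
    return u_max * math.tanh(output[0])
```

The same arithmetic can be written in a MATLAB Function block once the weights and biases are available in the workspace, which is why no network compression step is needed on the PC side.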

3. Model Equations and Equilibrium Points of the Pendubot

3.1 Derivation of the Pendubot Model Equations

Using the Sim-to-Real technique described in Section 2 requires model equations of the pendubot for the simulation environment. This subsection therefore derives the mathematical model of the pendubot. Fig. 5 shows a mechanical concept diagram of the pendubot. The variables in the figure follow SI units, and their meanings are as follows.

Fig. 5. Mechanical conceptual diagram of pendubot

$T$ is the torque applied to link 1, $m_{1}$ and $m_{2}$ are the masses of link 1 and link 2, and $l_{1}$ and $l_{2}$ are the distances from the rotation axis of link 1 and link 2, respectively, to the center of mass of that link. $\theta_{1}$ is the rotational displacement of link 1 measured from the direction normal to the ground, and $\theta_{2}$ is the rotational displacement of link 2 relative to link 1. $L_{1}$ is the distance from the rotation axis of link 1 to the rotation axis of link 2. $c$ is the friction coefficient at the rotation axis of link 2, and $I_{1}$ and $I_{2}$ are the moments of inertia of link 1 and link 2. $u$ is the angular acceleration of link 1, and $i,\: j,\: k$ are the axes of a Cartesian coordinate system whose origin is the rotation axis of link 1.

Deriving the model equations of the pendubot with the Euler-Lagrange equation gives Eq. (1).
(1)
$ \begin{bmatrix}m_{11}&m_{12}\\m_{21}&m_{22}\end{bmatrix}\begin{bmatrix}\ddot{\theta}_{1}\\\ddot{\theta}_{2}\end{bmatrix}+\begin{bmatrix}r_{1}\\r_{2}\end{bmatrix}=\begin{bmatrix}T\\0\end{bmatrix} $

The elements of Eq. (1) are given in Eq. (2).

(2)
$ \begin{aligned} m_{11}&= h_{1}+h_{2}+h_{3}+2h_{4}\cos(\theta_{2})\\ m_{12}&= h_{3}+h_{4}\cos(\theta_{2})\\ m_{21}&= h_{3}+h_{4}\cos(\theta_{2})\\ m_{22}&= h_{3}\\ r_{1}&= -h_{4}\sin(\theta_{2})\left(\dot{\theta}_{2}^{2}+2\dot{\theta}_{1}\dot{\theta}_{2}\right)-h_{5}\sin(\theta_{1})-h_{6}\sin(\theta_{1}+\theta_{2})\\ r_{2}&= h_{4}\sin(\theta_{2})\dot{\theta}_{1}^{2}-h_{6}\sin(\theta_{1}+\theta_{2})+c\dot{\theta}_{2} \end{aligned} $

$h_{1}$ to $h_{6}$ are defined as follows, where $g$ denotes the gravitational acceleration, $9.81\ [m/s^{2}]$.

(3)
$ \begin{aligned} h_{1}&= m_{2}L_{1}^{2}\\ h_{2}&= m_{1}l_{1}^{2}+I_{1}\\ h_{3}&= m_{2}l_{2}^{2}+I_{2}\\ h_{4}&= m_{2}L_{1}l_{2}\\ h_{5}&= (m_{2}L_{1}+m_{1}l_{1})g\\ h_{6}&= m_{2}l_{2}g \end{aligned} $

Rearranging Eq. (1) for the angular accelerations gives Eq. (4).

(4)
$ \begin{bmatrix}\ddot{\theta}_{1}\\\ddot{\theta}_{2}\end{bmatrix}=\begin{bmatrix}m_{11}&m_{12}\\m_{21}&m_{22}\end{bmatrix}^{-1}\left\{\begin{bmatrix}T\\0\end{bmatrix}-\begin{bmatrix}r_{1}\\r_{2}\end{bmatrix}\right\} $

If the angular acceleration of link 1, $\ddot{\theta}_{1}$, is taken as the control input $u$, the second row of Eq. (1), $m_{21}\ddot{\theta}_{1}+m_{22}\ddot{\theta}_{2}+r_{2}=0$, yields an angular-acceleration model, and the equations of motion of the pendubot take the form of Eq. (5).

(5)
$ \begin{aligned} \ddot{\theta}_{1}&=u\\ \ddot{\theta}_{2}&=-\dfrac{m_{21}u+r_{2}}{m_{22}} \end{aligned} $

Defining the state variables as $x_{1}=\theta_{1},\: x_{2}=\theta_{2},\: x_{3}=\dot{\theta}_{1},\: x_{4}=\dot{\theta}_{2}$ and using the equations of motion in Eq. (5), the model of the pendubot can finally be written as the nonlinear state-space equation in Eq. (6).

(6)
../../Resources/kiee/KIEE.2025.74.1.118/eq6.png
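For reference, the state-space form implied by the definitions above can be worked out directly. The second row of Eq. (1) reads $m_{21}\ddot{\theta}_{1}+m_{22}\ddot{\theta}_{2}+r_{2}=0$, so with $\ddot{\theta}_{1}=u$ and the state variables defined above, one consistent way to write the model is the following. This is a reconstruction from Eqs. (1) and (2), not a transcription of the typeset Eq. (6).

```latex
\begin{aligned}
\dot{x}_{1} &= x_{3},\\
\dot{x}_{2} &= x_{4},\\
\dot{x}_{3} &= u,\\
\dot{x}_{4} &= -\frac{m_{21}(x_{2})\,u + r_{2}(x_{1},x_{2},x_{3},x_{4})}{m_{22}},
\end{aligned}
\qquad
m_{21}(x_{2}) = h_{3} + h_{4}\cos(x_{2}),\quad m_{22} = h_{3}
```

Here $r_{2}$ is the expression from Eq. (2) with $\theta_{1},\theta_{2},\dot{\theta}_{1},\dot{\theta}_{2}$ replaced by $x_{1},x_{2},x_{3},x_{4}$. Only $h_{3}$, $h_{4}$, $h_{6}$, and $c$ appear, which matches the fact that only link 2 parameters and $L_{1}$ are needed for simulation.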

The pendubot model assumes that only rotation about the rotation axis of each link can occur and that no other translational or rotational motion takes place. The only friction considered in the model is rotational friction that is linear in the rotational speed. Coulomb friction is not considered.

3.2 Equilibrium Points of the Pendubot

The pendubot has two links and therefore has various equilibrium points depending on the rotational displacements of link 1 and link 2. These include stable states in which link 2 hangs down and inverted states. This paper deals only with the inverted equilibrium points. Of these, only the inverted equilibrium points at which the angular displacement of link 1 is $\theta_{1}=0,\: \theta_{1}=-\dfrac{\pi}{6},\: \theta_{1}=\dfrac{\pi}{6}$ are realized in the experiments. Fig. 6 visualizes these equilibrium points.

Fig. 6. Equilibrium points of pendubot system.

4. Experiments and Results

Using the model equations and the pendubot system described above, this section designs a reinforcement-learning-based controller with the Sim-to-Real technique, applies it to the real system, and verifies the usefulness of the proposed platform. The section is organized as follows. It first describes the settings used to build the simulation environment and the process of implementing the reinforcement-learning-based controller. It then describes the experimental environment and the experimental results.

4.1 Simulation Environment and Settings

The environment that the reinforcement learning agent interacts with directly during training was implemented as a simulation in Python, based on the model equations in Section 3. The physical parameters of the pendubot used to build the environment are listed in Table 1, and the fourth-order Runge-Kutta method (ode4) was chosen as the solver for the nonlinear ordinary differential equations. After training in this simulation environment is complete, the agent is converted, using the "matlab.engine" Python API mentioned in Section 2, into a form usable in Matlab/Simulink and used in the reinforcement-learning-based controller.
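One simulation update of such an environment can be sketched with a hand-written fourth-order Runge-Kutta step. The sketch assumes the angular-acceleration model $\ddot{\theta}_{1}=u$, $m_{21}u+m_{22}\ddot{\theta}_{2}+r_{2}=0$ from Section 3 and the parameter values of Table 1. The gravitational acceleration is multiplied into the $h_{6}$ term explicitly, which is an assumption made here for dimensional consistency, and the function names are illustrative rather than the authors' code.

```python
import math

# Table 1 parameters (SI units) and gravity
L1, I2, M2, L2C, C = 0.1645, 7.9454e-04, 0.1592, 0.1512, 1.6626e-05
G = 9.81

H3 = M2 * L2C ** 2 + I2
H4 = M2 * L1 * L2C
H6G = M2 * L2C * G  # h6 with gravity applied explicitly (assumption)


def derivatives(state, u):
    """Time derivative of (theta1, theta2, dtheta1, dtheta2) for input u."""
    th1, th2, w1, w2 = state
    m21 = H3 + H4 * math.cos(th2)
    r2 = H4 * math.sin(th2) * w1 ** 2 - H6G * math.sin(th1 + th2) + C * w2
    return (w1, w2, u, -(m21 * u + r2) / H3)


def rk4_step(state, u, dt=1e-3):
    """Advance the state by dt with classical RK4, holding u constant."""
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))

    k1 = derivatives(state, u)
    k2 = derivatives(shifted(k1, dt / 2), u)
    k3 = derivatives(shifted(k2, dt / 2), u)
    k4 = derivatives(shifted(k3, dt), u)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )
```

Starting exactly at the inverted equilibrium with zero input leaves the state unchanged, while a small offset in $\theta_{2}$ grows over time, which is the instability the controller has to counteract.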

In the simulated training environment, the length of one episode is set to 10 s, the simulation is updated every 1 ms, and the agent observes the state every 10 ms. The agent therefore interacts with the environment at most 1000 times per episode and improves its policy based on the reward at each interaction. The episode ends when 1000 interactions have taken place or when the absolute angular velocity of link 1, $\left |\dot{\theta}_{1}\right |$, exceeds $25\ rad/s$. Following the nonlinear state equation in Section 3, the state observable in the pendubot simulation environment consists of <$\theta_{1},\: \theta_{2},\: \dot{\theta}_{1},\: \dot{\theta}_{2}$>. To simplify the design of the reward function, $\theta_{1}$ and $\theta_{2}$ are restricted to the range $-\pi <\theta <\pi$ by a modulo operation. In addition, $\theta_{1}$ and $\theta_{2}$ are re-expressed as $\sin(\theta_{i}),\: \cos(\theta_{i})$ to gain the benefits of normalization and continuity during training.

The three equilibrium points mentioned in Section 3 are identified by $\tau\in\left\{-\dfrac{\pi}{6},\: 0,\: \dfrac{\pi}{6}\right\}$. Each value of $\tau$ corresponds one-to-one to one of the three equilibrium points, and from this value the agent knows which target equilibrium point it should move to. As a result, the restructured state that the agent observes at every timestep consists of seven values: <$\sin(\theta_{1}),\: \cos(\theta_{1}),\: \sin(\theta_{2}),\: \cos(\theta_{2}),\: \dot{\theta}_{1},\: \dot{\theta}_{2},\: \tau$>. The agent takes this state as input and outputs the control input, the action given by its policy. The control input is the angular acceleration of the motor, $u$, and it is limited to $-100<u<100$ in view of the limits of the real actuator.
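The observation and termination rules above can be sketched in a few helper functions. The numbers (seven observation values, 1000 steps, 25 rad/s, the limit of 100 on $u$) come from the text, while the function names are hypothetical and the modulo formula is one common way to wrap an angle, used here as an assumption.

```python
import math


def wrap_angle(theta):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi


def make_observation(th1, th2, w1, w2, tau):
    """Build the 7-element observation <sin, cos, sin, cos, w1, w2, tau>."""
    th1, th2 = wrap_angle(th1), wrap_angle(th2)
    return [math.sin(th1), math.cos(th1), math.sin(th2), math.cos(th2), w1, w2, tau]


def clip_action(u, u_max=100.0):
    """Limit the commanded angular acceleration to the actuator range."""
    return max(-u_max, min(u_max, u))


def episode_done(step_count, w1, max_steps=1000, w1_limit=25.0):
    """End after 1000 agent steps (10 s at 10 ms) or if |dtheta1| exceeds 25 rad/s."""
    return step_count >= max_steps or abs(w1) > w1_limit
```

Since the simulation is updated every 1 ms and the agent acts every 10 ms, one agent step corresponds to ten solver updates with the same action held constant.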

Table 1 Parameters of the pendubot used in the experiment

| Parameter | Value |
| --- | --- |
| $L_{1}$ | 0.1645 [m] |
| $I_{2}$ | 7.9454e-04 [kg·m²] |
| $m_{2}$ | 0.1592 [kg] |
| $l_{2}$ | 0.1512 [m] |
| $c$ | 1.6626e-05 |

4.2 Implementation of the Reinforcement-Learning-Based Controller

4.2.1 Reinforcement Learning Algorithm

The reinforcement learning agent used in this work is implemented with the Truncated Quantile Critics (TQC) algorithm [19]. TQC is effective for policy optimization over continuous action spaces. It combines the strengths of Quantile Regression Deep Q-Network (QR-DQN) [20] and Soft Actor-Critic [21] to mitigate the overestimation problem and to estimate the return distribution more precisely. This keeps the algorithm from selecting actions on the basis of extremely optimistic return estimates and leads it to decide its policy on more realistic expectations. The hyperparameter values used to implement the algorithm are listed in Table 2.
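The truncation step that gives TQC its name can be illustrated in isolation. In TQC, each of the $N$ critics predicts $M$ quantile atoms of the return. The atoms of all critics are pooled and sorted, and the largest ones are discarded before the target is formed. The sketch below shows only that step. The number of atoms dropped per critic is not stated in this paper, so the default of 2 is an assumption, and the function name is hypothetical.

```python
def truncated_atoms(critic_atoms, drop_per_critic=2):
    """Pool the atoms of all critics, sort them, and drop the largest ones.

    critic_atoms: list of N lists, each holding the M atoms of one critic.
    drop_per_critic: atoms removed per critic (assumed value, not from the paper).
    """
    pooled = sorted(atom for atoms in critic_atoms for atom in atoms)
    n_drop = drop_per_critic * len(critic_atoms)
    return pooled[: len(pooled) - n_drop]
```

Because only the upper tail is removed, the mean of the kept atoms is never larger than the mean of all atoms, which is how the overestimation is damped. With $N=3$ and $M=25$ as in Table 2, 75 atoms are pooled and, under the assumed setting, 69 are kept.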

The agent implemented with TQC interacts continuously with the pendubot simulation environment implemented in Python and learns a strategy for swing-up control. According to [19], the training result with the number of critics $N=3$ is similar to the results with $N=5$ and $N=10$, so $N=3$ was chosen for training. In addition, the size of the policy network, which corresponds to the controller, was increased so that control toward all three equilibrium points could be learned. All other values are the same hyperparameters as those used by the authors of [19].

Table 2 Hyperparameters used in this experiment

| Hyperparameter | Value |
| --- | --- |
| Optimizer | Adam |
| Learning rate | 0.0003 |
| Discount factor ($\gamma$) | 0.99 |
| Replay buffer size | 1e6 |
| Number of critics ($N$) | 3 |
| Number of hidden layers in critic networks | 3 |
| Size of hidden layers in policy networks | 512 |
| Size of hidden layers in 1st policy networks | 2 |
| Size of hidden layers in 2nd policy networks | 400 |
| Minibatch size | 300 |
| Nonlinearity | ReLU |
| Target smoothing coefficient ($\beta$) | 0.005 |
| Number of atoms ($M$) | 25 |

4.2.2 Reward Function Design and Training Results

At every interaction with the environment, the reinforcement learning agent improves its policy based on the reward at that moment. The reward function used to compute the reward depends on which of the three equilibrium points used in this experiment is to be reached. The pendubot has stable equilibrium points in which link 2 hangs down as well as inverted equilibrium points, and the equilibrium points to be realized in this experiment are the inverted ones.

Table 3 lists the target angle of each link, $\theta_{i}^{*}$, for each value of $\tau$, which identifies the target equilibrium point, and the reward function is given in Eq. (7).

Table 3 Target angle of each link for each target equilibrium point

| $\tau$ | $\theta_{1}^{*}$ | $\theta_{2}^{*}$ |
| --- | --- | --- |
| $-\dfrac{\pi}{6}$ | $-\dfrac{\pi}{6}$ | $\dfrac{\pi}{6}$ |
| $0$ | $0$ | $0$ |
| $\dfrac{\pi}{6}$ | $\dfrac{\pi}{6}$ | $-\dfrac{\pi}{6}$ |
(7)
$ \begin{aligned} R_{\theta_{1}}&=0.5+0.5\cos(\theta_{1}-\theta_{1}^{*})\\ R_{\theta_{2}}&=0.5+0.5\cos(\theta_{2}-\theta_{2}^{*})\\ R_{\dot{\theta}_{1}}&=\exp(-0.1\left |\dot{\theta}_{1}\right |)\\ R_{\dot{\theta}_{2}}&=\exp(-0.1\left |\dot{\theta}_{2}\right |)\\ R_{u}&=\exp(-0.003\left | u \right |) \end{aligned} $

The final reward function is the product of all of these components.

(8)
$ Reward=R_{\theta_{1}}R_{\theta_{2}}R_{\dot{\theta}_{1}}R_{\dot{\theta}_{2}}R_{u} $

Each component of the reward function is shown in Fig. 7, and every component is normalized to the range [0, 1]. The $Reward$ formed by their product therefore also takes values in [0, 1]. Since one episode consists of at most 1000 timesteps, the maximum total reward obtainable in one episode is 1000.

Fig. 7. Reward function graph

Fig. 8. Learning results graph

The terms $R_{\dot{\theta}_{1}}$ and $R_{\dot{\theta}_{2}}$, which do not contain the $\tau$-dependent target angles $\theta_{i}^{*}$, are shaped so that the reward increases as the angular velocities approach 0. This encourages the pendubot to move as little as possible at the equilibrium point and hence to use as little control effort as possible. The terms that do depend on $\theta_{i}^{*}$, $R_{\theta_{1}}$ and $R_{\theta_{2}}$, give a higher reward the closer the pendubot is to the target equilibrium point. The agent thus improves its policy so as to reach the target equilibrium point determined by the state variable $\tau$. Fig. 8 shows the result of repeated training under these conditions, with the value of $\tau$ changed at every episode.
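The reward described in this subsection can be sketched as a single function. The coefficients 0.5, 0.1, and 0.003 are those of Eq. (7), and the targets follow Table 3, where $\theta_{1}^{*}=\tau$ and $\theta_{2}^{*}=-\tau$. The sketch assumes that each angle term peaks when the angle equals its target and that the fifth factor penalizes the magnitude of the control input $u$. The function name is hypothetical.

```python
import math


def reward(th1, th2, w1, w2, u, tau):
    """Product of five terms, each in [0, 1]; equals 1 only at rest on the target."""
    th1_ref, th2_ref = tau, -tau  # target angles as listed in Table 3
    r_th1 = 0.5 + 0.5 * math.cos(th1 - th1_ref)
    r_th2 = 0.5 + 0.5 * math.cos(th2 - th2_ref)
    r_w1 = math.exp(-0.1 * abs(w1))
    r_w2 = math.exp(-0.1 * abs(w2))
    r_u = math.exp(-0.003 * abs(u))
    return r_th1 * r_th2 * r_w1 * r_w2 * r_u
```

Because the terms are multiplied rather than added, a poor value in any single term pulls the whole reward toward zero. With link 1 hanging straight down, for example, the angle term for $\theta_{1}$ vanishes and the reward is zero regardless of the other terms.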

4.3 Experimental Environment

The experiment for verifying the usefulness of the proposed platform is as follows. First, swing-up control moves the pendubot to the equilibrium point at $\theta_{1}=0$, and transition control then moves it in turn to the equilibrium points at $\theta_{1}=-\dfrac{\pi}{6}$ and $\theta_{1}=\dfrac{\pi}{6}$. Finally, a disturbance is applied at the $\theta_{1}=0$ equilibrium point so that link 2 leaves the equilibrium, and the Recovery characteristic [13], the return to the original equilibrium point, is checked.

Fig. 9 shows the Simulink model for the pendubot control experiment used to verify the platform, and Fig. 10 shows the hardware interface function implemented with the LW-RCP library blocks.

Fig. 9. Pendubot system control experiment Simulink model

Fig. 10. Hardware interface Simulink model

The reinforcement-learning-based controller is implemented as a matlab function block and is provided as a library block so that students can use it. The controller takes as inputs the four state variables of the pendubot, the target equilibrium point, and the control input. All state variable values are collected through the receive blocks in Fig. 10. For the user's convenience, the target equilibrium point is entered as an angle in degrees and converted to radians. From these inputs, the controller outputs the rotational angular acceleration of link 1, $\ddot{\theta}_{1}$, which is integrated and used as the speed command for the drive motor. The drive motor is controlled by a PI speed controller, and the PWM signal that drives the motor is output through the Send block in Fig. 10.
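The signal path from the controller output to the motor command can be sketched as a discrete integrator followed by a PI speed loop. Only the structure (acceleration integrated to a speed command, then PI control producing a PWM duty) comes from the text. The class name, the gains, the sample time, and the duty limit are hypothetical values chosen for illustration, and no anti-windup is included.

```python
class SpeedLoop:
    """Integrate an acceleration command and track the result with a PI law."""

    def __init__(self, kp, ki, dt, duty_limit=1.0):
        self.kp = kp                  # proportional gain (hypothetical)
        self.ki = ki                  # integral gain (hypothetical)
        self.dt = dt                  # sample time in seconds
        self.duty_limit = duty_limit  # saturation of the PWM duty command
        self.speed_ref = 0.0
        self.error_integral = 0.0

    def update(self, accel_cmd, measured_speed):
        """Return a saturated PWM duty for one sample."""
        self.speed_ref += accel_cmd * self.dt  # acceleration -> speed command
        error = self.speed_ref - measured_speed
        self.error_integral += error * self.dt
        duty = self.kp * error + self.ki * self.error_integral
        return max(-self.duty_limit, min(self.duty_limit, duty))
```

Commanding an acceleration rather than a torque keeps the learned policy independent of the motor's electrical parameters, because the inner speed loop absorbs them.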

4.4 Experimental Results

Fig. 11 is a frame from the laboratory's Youtube video showing the results of the pendubot control experiment. The video is available at https://youtu.be/G9ik_Ha4E70 (video title: Sim-to-real reinforcement learning control of a pendubot, channel name: Embedded Control Lab.). Fig. 12 plots the experimental results for the swing-up control, transition control, and Recovery characteristic described in Section 4.3. The x-axis of each graph is time, and the y-axes are $\theta_{1}$ and $\theta_{2}$, respectively. In the swing-up and transition control experiments, the reinforcement-learning-based controller moves the pendubot to the target equilibrium points with an average error of about $0.02\ [rad]$. The dotted line in Fig. 12 marks the moment the disturbance is applied, about 17 s after the start of the experiment. After the disturbance, the controller attempts swing-up control toward the $\theta_{1}=0$ equilibrium point and returns to the target equilibrium point, which confirms the Recovery characteristic. As the video and graphs show, a variety of experiments could be carried out easily, and the Recovery characteristic of the reinforcement-learning-based controller could be observed directly. This verifies the usefulness of the proposed platform.

Fig. 11. YouTube video showing the experiment


Fig. 12. Results of the pendubot control experiment


5. Conclusion

This paper proposed a reinforcement-learning-based control education platform using Python and LW-RCP and set out to verify its effectiveness. The neural network that constitutes the reinforcement-learning-based controller is trained in Python using a Sim-to-Real technique, and the trained network is then converted into a form compatible with the MATLAB/Simulink environment. The Simulink model for controlling the physical system consists of two parts, the actuator control part and the reinforcement-learning-based controller. The actuator control part is built from library blocks provided by LW-RCP, and the controller is a neural network implemented as a MATLAB Function block.
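As an illustration of what converting a trained network into a MATLAB-compatible form might involve, the Python sketch below writes the weights and biases of a fully connected network as MATLAB matrix literals that could be pasted into a MATLAB Function block. The format and the helper names `to_matlab_matrix` and `export_mlp` are assumptions for this example. The conversion procedure actually used by the platform is not specified in this section.

```python
def to_matlab_matrix(name, rows):
    """Format a list of rows as a MATLAB assignment, e.g. W1 = [1.0, 2.0; 3.0, 4.0];"""
    body = "; ".join(", ".join(repr(float(v)) for v in row) for row in rows)
    return f"{name} = [{body}];"


def export_mlp(layers):
    """layers: list of (weight_rows, bias_list) pairs, one per layer.
    Returns MATLAB script text defining W1, b1, W2, b2, ..."""
    lines = []
    for i, (weights, biases) in enumerate(layers, start=1):
        lines.append(to_matlab_matrix(f"W{i}", weights))
        # Biases are written as column vectors so that W*x + b is well formed.
        lines.append(to_matlab_matrix(f"b{i}", [[v] for v in biases]))
    return "\n".join(lines)
```

With the matrices defined in this way, the forward pass on the MATLAB side reduces to repeated matrix-vector products followed by the network's activation functions.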

A pendubot control experiment was then carried out to verify the effectiveness of the platform, and both the swing-up and transition control experiment and the recovery property experiment were successful. This confirms that the proposed platform can be used for reinforcement-learning-based control education. We therefore expect the proposed platform to be of great help in running courses that apply reinforcement learning to physical systems.

Acknowledgements

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (Ministry of Science and ICT) (RS-2024-00347193).


About the Authors

Jongbeom Lee (이종범)

He received the B.S. degree in electrical engineering from Inha University in 2024. He is now an M.S. candidate in electrical and computer engineering at Inha University. His research interests include optimal control, reinforcement learning, and embedded systems.

Taegun Lee (이태건)

He received the B.S. degree in electrical engineering from Inha University in 2023. He is now an M.S. candidate in electrical and computer engineering at Inha University. His research interests include reinforcement learning, embedded systems, and optimal control.

Doyoon Ju (주도윤)

He received the M.S. degree in electrical and computer engineering from Inha University in 2023. He is now a Ph.D. candidate in electrical and computer engineering at Inha University. His research interests include optimal control, embedded systems, and reinforcement learning.

Young Sam Lee (이영삼)

He received the B.S. and M.S. degrees in electrical engineering from Inha University, Incheon, South Korea, in 1999, and the Ph.D. degree in electrical engineering from Seoul National University, South Korea, in 2003. From 2003 to 2004, he was a Senior Researcher with Samsung Electronics Co. Since 2004, he has been with the Department of Electrical and Computer Engineering, Inha University. He is the author of four books and more than 60 articles. His research interests include computer-aided control system design, rapid control prototyping, control and instrumentation, robot engineering, and embedded systems.