DEV Community
#reinforcementlearning
From Pixels to Physicality: Engineering Olaf with Reinforcement Learning, Control Systems, and Illusion Design
Hemant
Mar 22
#ai #machinelearning #rpa #reinforcementlearning
8 min read

[Meta-RL] We told an AI agent 'you can fail 3 times.' Accuracy went up 19%.
nasuy
Mar 19
#ai #agents #reinforcementlearning #machinelearning
4 reactions
4 min read

Challenging Dogma: Simple Fine-Tuning Enables Continual Learning in VLA Models
thilak15
Mar 13
#ai #machinelearning #continuallearning #reinforcementlearning
2 min read

Reinforcement Learning for Robotics: A Comprehensive 2025 Guide
Abhishek Nair
Mar 15
#reinforcementlearning #robotics #rl #sac
1 reaction
52 min read

How I Built a Readable AlphaZero From Scratch - A Deep Dive Into the Code
Zhixiang Li
Mar 1
#alphazero #reinforcementlearning #deeplearning #python
1 reaction
10 min read

I Built an AI Arena and Trained AlphaZero to Play Gomoku: Here's How
Zhixiang Li
Mar 1
#ai #alphazero #deeplearning #reinforcementlearning
1 reaction
4 min read

Fixing an Off-By-One Bug in PufferLib's PPO Implementation
Jacob Lee
Jan 10
#machinelearning #reinforcementlearning #opensource #python
2 min read

Multi armed bandit exercise 2.5 with C#
davide lettieri
Jan 6
#csharp #reinforcementlearning #karmedbanditproblem
4 min read

Sutton & Barto Gridworld example in C#
davide lettieri
Jan 6
#csharp #reinforcementlearning
5 min read

HRPO-X v1.0.1: from HRPO paper production-hardened runnable code
Kwansub Yun
Jan 7
#reinforcementlearning #mlops #opensource #github
2 min read