Hi,
I am Sourabh Jain, an experienced Python programmer. I have also written a fully connected feed-forward neural network in Python for analyzing stock price movements. For the reinforcement learning part, I will model both the target value and the present value as a weighted function of the black and red piece counts. I will minimize the RMS error between the present value and the target value, and after every move I will update the weights using the normalized error. If you would like to discuss this further, I am available.
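
As a rough sketch of the approach described above: this assumes a linear value function over a bias term plus the black and red piece counts, and a normalized least-mean-squares step as one possible reading of "normalized error". The function names, the learning rate, and the feature layout are illustrative assumptions, not a final design.

```python
import math
from typing import List, Sequence, Tuple


def evaluate(weights: Sequence[float], features: Sequence[float]) -> float:
    """Present value of a position: weighted sum of its features."""
    return sum(w * x for w, x in zip(weights, features))


def update_weights(
    weights: Sequence[float],
    features: Sequence[float],
    target: float,
    learning_rate: float = 0.1,  # assumed value, to be tuned
) -> List[float]:
    """One normalized LMS step: move the weights toward the target value.

    The error is divided by the squared norm of the feature vector so that
    positions with many pieces do not cause disproportionately large updates.
    """
    error = target - evaluate(weights, features)
    norm = sum(x * x for x in features) or 1.0  # guard against all-zero features
    step = learning_rate * error / norm
    return [w + step * x for w, x in zip(weights, features)]


def rms_error(
    weights: Sequence[float],
    samples: Sequence[Tuple[Sequence[float], float]],
) -> float:
    """Root-mean-square error over (features, target) pairs."""
    squared = [(target - evaluate(weights, feats)) ** 2 for feats, target in samples]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical feature layout: [bias, black piece count, red piece count]
features = [1.0, 12.0, 12.0]
weights = [0.0, 0.0, 0.0]
weights = update_weights(weights, features, target=1.0)
```

In practice the target for a position would come from the value of a later position in the same game, so the weights are adjusted after every move.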
Regards,
Sourabh