I am using SolrCloud 4.10.1 with 3 shards and a replication factor of 2, meaning I have 6 cores altogether.
I index into this SolrCloud with a multi-threaded application that uses the SolrJ API and can reach up to 75 concurrent threads.
The issue I am currently facing is this: in theory, the replicas of a given shard should hold identical documents, but in my case the two replicas of each shard end up as disjoint sets, with no overlap whatsoever. Some delta between replicas of the same shard would be acceptable, but these are completely disjoint. Because I query through a load balancer, the outcome depends on which replica the request lands on: when using filter queries, I sometimes get a result and sometimes do not.
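A minimal sketch of how the disjoint-set symptom could be confirmed, assuming each replica core is reachable directly over HTTP. Adding `distrib=false` to a query keeps it on the addressed core instead of fanning out across the collection, so the ids returned by each replica can be compared. The class name, method names, host, and core name below are hypothetical; fetching and parsing the responses is left out, and the code avoids anything newer than Java 1.7.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for comparing the ids held by two replicas of one shard.
class ReplicaDiff {

    // Builds a query URL that is answered only by the addressed core.
    // distrib=false stops Solr from forwarding the request to other cores.
    public static String localQueryUrl(String coreBaseUrl, String query, int rows) {
        try {
            return coreBaseUrl + "/select?q=" + URLEncoder.encode(query, "UTF-8")
                    + "&fl=id&rows=" + rows + "&wt=json&distrib=false";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    // Ids present in both replicas; empty when the replicas are disjoint.
    public static Set<String> overlap(Set<String> a, Set<String> b) {
        Set<String> common = new TreeSet<String>(a);
        common.retainAll(b);
        return common;
    }

    // Ids present in the first replica but missing from the second.
    public static Set<String> onlyIn(Set<String> a, Set<String> b) {
        Set<String> diff = new TreeSet<String>(a);
        diff.removeAll(b);
        return diff;
    }
}
```

For example, `ReplicaDiff.localQueryUrl("http://solr-host-1:8983/solr/mycollection_shard1_replica1", "*:*", 100)` produces a URL for the first replica; running the same query against the second replica and passing both id sets to `overlap` shows whether the two really share no documents.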
I have tried various options but have not been able to resolve the issue, and I ended up turning off replication altogether. This is not sustainable and can lead to search requests being choked.
I need you to find the root cause of this. I am using stock SolrCloud 4.10.1 on CentOS 7.x with Java 1.7.
I have worked with Solr recently, and I know how to create a collection and how to add a core to an existing collection. We could try adding a core to the collection as a replica to see whether the newly added replica functions correctly.
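Adding a replica to an existing collection could be sketched as follows, assuming the Collections API `ADDREPLICA` action is used for it. The snippet only builds the request URL; the class name, host, collection name, and shard name are hypothetical placeholders, and the request would still need to be sent to a live Solr node.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper that builds a Collections API request for a new replica.
class AddReplicaUrl {

    // solrBaseUrl is expected to look like http://host:8983/solr (an assumption).
    public static String build(String solrBaseUrl, String collection, String shard) {
        return solrBaseUrl + "/admin/collections?action=ADDREPLICA"
                + "&collection=" + encode(collection)
                + "&shard=" + encode(shard);
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, `AddReplicaUrl.build("http://solr-host-1:8983/solr", "mycollection", "shard1")` yields a URL that asks Solr to create one more replica of `shard1`. Once the new replica has finished recovery, its document count can be compared with the existing replicas of that shard.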