Post on 14-Apr-2017
transcript
2
The magic of creation
"Any sufficiently advanced technology is indistinguishable from magic"
—Arthur C. Clarke
bin/solr start -e techproducts
Pure magic
1. Start the server
2. Set up the collection
3. Populate with documents
4. Commit
5. Profit!
3
The price of magic
bin/solr start … ???
1. What port is the server running on?
2. What is the collection name?
3. Is it a static or dynamic schema? Or schemaless?
4. Which directory is the schema configuration in? The data?
5. What documents have we populated already?
6. Is everything committed?
7. WHY DOES MY QUERY NOT WORK!!! :(
4
Troubleshooting process
1. Troubleshooting is not a linear process
2. It is not taught often or well
3. Book is coming soon(-ish…)
4. Based on my experience as:
   • Solr-based project developer and popularizer
   • Senior (Weblogic) tech-support for 3 years
5. Hard to explain the book in 40 minutes
6. TreeMap is a (slightly) faster mental model
7. Adaptation of Root Cause Analysis
8. Top-level concepts described in "The New Rational Manager" by Kepner and Tregoe (1997)
5
Troubleshooting TreeMap
1. Establish the boundaries
2. Split the problem
3. Identify the relevant part
4. Zoom in
5. Re-formulate the boundaries
6. Repeat 2-5 until fixed
6
Establishing the boundaries – Root Cause Analysis
Identity
Location
Timing
Magnitude
7
Boundaries - Identity
Identity – action we want to accomplish/problem to solve
Initial (black-box) identity –
   "echoParams is duplicated with example config, sometimes"
Zoomed-in –
   "Any query parameter that is also in request handler's defaults is duplicated"
See SOLR-6780 for full story, a.k.a. "an evil freaking bug"
Gets easier with practice
8
Boundaries - Location
Problem: Solr cannot find customer records
Could be indexing
• Record was never sent to Solr
• Wrong handler
• Invalid schema definition
• Incorrect URP pipeline
• ...
Could be searching
• Query too restrictive
• Query too permissive
• Searching wrong fields
• Searching against catch-all field
• ...
Cloud adds many more locations
Location – Place (component) where the problem happens
9
Boundaries - Timing
Timing – when/how often the problem shows itself
Reproducibility
1. Always – ideal, reproducible with debugger on, logs on/off
2. Seemingly intermittent (a.k.a. sometimes) – useless
3. On trigger X (e.g. on commit) – nearly as good as always
Onset
1. Did the system work at time point X, but not at time point Y? => What did you change in the meantime?
2. Problem exists != Problem noticed, may have been shadowed
10
Boundaries - Magnitude
Magnitude – WHAT is the extent of the problem
• Latest Solr or a single old version (or a range of them)?
• Standard example configuration or only with custom schema?
• A single node or a whole cluster?
• The more standard/recent the config is => the easier it is to troubleshoot
11
Boundaries – through negation and comparison
“I choose a block of marble and chop off whatever I don’t need”
— (sculptor) Auguste Rodin
Clarify the problem by saying what it is NOT as well
1. Example: "This affects Solr 5.1, BUT not Solr 5.2"
2. The BUT part requires testing and may prove to be untrue
3. Thinking of negative condition simplifies/purifies test case
4. Also gives a parallel use-case that works – great for debugging
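The working parallel case from point 4 can be compared mechanically with the failing one. A minimal Python sketch of such a comparison; the setting names and values below are hypothetical and only illustrate the idea:

```python
def diff_settings(working, failing):
    """Return {key: (working_value, failing_value)} for every key that differs."""
    missing = object()  # sentinel, so an absent key is not confused with a None value
    differences = {}
    for key in sorted(set(working) | set(failing)):
        w = working.get(key, missing)
        f = failing.get(key, missing)
        if w != f:
            differences[key] = (
                None if w is missing else w,
                None if f is missing else f,
            )
    return differences


# Hypothetical settings, for illustration only.
works = {"solr_version": "5.2", "schema": "example", "handler": "/select"}
fails = {"solr_version": "5.1", "schema": "example", "handler": "/select"}

print(diff_settings(works, fails))
```

Whatever remains in the result is the candidate boundary; everything else has been "chopped off".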
12
Practical boundaries – what does the start script do?
bin/solr start … ???
1. Do not try to read the script – look at the ground truth
2. In Admin UI
   • Dashboard -> Versions -> solr-spec (version)
   • Dashboard -> JVM -> Args (command line params, abbrev.)
   • Collection -> Overview -> Instance (all the directories)
3. On command line (Unix, Mac, and the like):
   ps -aef |grep java
   /usr/bin/java -server -Xss256k -Xms512m -Xmx512m -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark -XX:PretenureSizeThreshold=64m -XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000 -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -verbose:gc -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:/Users/arafalov/SearchEngines/solr-5.3.1/example/techproducts/solr//../logs/solr_gc.log …
4. On Windows: use Microsoft/Sysinternals Process Explorer
5. Example: SOLR-8073
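A long java command line is easier to read once it is grouped by option type. A minimal Python sketch, not part of Solr: the function name and grouping are assumptions made for illustration, and the sample uses only tokens visible in the truncated excerpt above. The -D branch covers system properties, which a Java command line may carry but which the excerpt does not show.

```python
import shlex


def split_jvm_args(command_line):
    """Group the tokens of a java command line by kind of option."""
    groups = {"memory": [], "xx": [], "system_properties": {}, "other": []}
    for token in shlex.split(command_line)[1:]:  # skip the java executable itself
        if token.startswith(("-Xms", "-Xmx", "-Xss")):
            groups["memory"].append(token)
        elif token.startswith("-XX:"):
            groups["xx"].append(token)
        elif token.startswith("-D"):
            name, _, value = token[2:].partition("=")
            groups["system_properties"][name] = value
        else:
            groups["other"].append(token)
    return groups


sample = (
    "/usr/bin/java -server -Xss256k -Xms512m -Xmx512m "
    "-XX:NewRatio=3 -XX:+UseConcMarkSweepGC -verbose:gc"
)
args = split_jvm_args(sample)
print(args["memory"])
print(args["xx"])
print(args["other"])
```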
13
TreeMap – black box
Indexing Searching
14
TreeMap – black box
Indexing
15
TreeMap – indexing - details
1. Choose Request Handler (e.g. /update)
   • UpdateRequestHandler
   • ExtractingRequestHandler – Tika
2. Calculate all parameters
   • URL explicit
   • Handler params (defaults, appends, invariants)
   • Global defaults (initParams)
   • Shared param blocks (useParams)
   • Hardcoded
   • REST-driven overrides
3. Execute Request Handler
   • Generates standard Solr document
4. UpdateRequestProcessors (URPs)
   • Explicit chain
   • Parameter-supplied chain
   • Built-in chain
   URPs are where the work actually happens
5. Mapping to schema fields
   • Explicit field
   • Dynamic fields
   • CopyFields
6. Commit
   • Manual
   • Delayed (commitWithin)
   • Soft
   • Hard
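Step 2 is where many surprises come from, so it helps to have a mental model of how the handler sections combine. A simplified Python sketch, not Solr's actual implementation: it models only URL parameters plus a handler's defaults, appends and invariants, and ignores initParams, useParams, hardcoded values and REST-driven overrides. All parameter values are hypothetical.

```python
def effective_params(request, defaults=None, appends=None, invariants=None):
    """Simplified model of combining request params with handler sections.

    Every argument maps a parameter name to a list of values.
    """
    # Start from the defaults.
    result = {name: list(values) for name, values in (defaults or {}).items()}
    # Explicit request parameters replace defaults of the same name.
    for name, values in request.items():
        result[name] = list(values)
    # Appends are added to whatever values are already present.
    for name, values in (appends or {}).items():
        result.setdefault(name, []).extend(values)
    # Invariants win over everything else.
    for name, values in (invariants or {}).items():
        result[name] = list(values)
    return result


# Hypothetical request and handler configuration, for illustration only.
params = effective_params(
    request={"q": ["test"], "wt": ["json"], "rows": ["100"]},
    defaults={"wt": ["xml"], "df": ["text"]},
    appends={"fq": ["inStock:true"]},
    invariants={"rows": ["10"]},
)
print(params)
```

In this model the request overrides wt, cannot override rows, and silently gains an fq it never asked for, which is exactly the kind of thing worth checking before blaming the query.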
16
Boundaries – example - discovering parameters
INFO - [ x:techproducts] ...LogUpdateProcessor; [techproducts] webapp=/solr path=/update params={} {add=[3007WFP (1515103857103863808)]}
DEBUG - [ x:techproducts] ...LogUpdateProcessor; PRE_UPDATE add{,id=3007WFP} {{params(df=text),defaults(wt=xml)}}
solr.log
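Request lines like the INFO entry above can be picked apart with a small script when there are too many to read by eye. A minimal Python sketch, assuming the `webapp=… path=… params={…}` layout shown in the excerpt and assuming the text inside the braces is formatted like a URL query string; the second sample line is hypothetical.

```python
import re
from urllib.parse import parse_qs

# Matches the request part of a log line as it appears in the excerpt.
REQUEST_RE = re.compile(
    r"webapp=(?P<webapp>\S+) path=(?P<path>\S+) params=\{(?P<params>[^}]*)\}"
)


def parse_request_log(line):
    """Return webapp, path and params of a request log line, or None."""
    match = REQUEST_RE.search(line)
    if match is None:
        return None
    return {
        "webapp": match.group("webapp"),
        "path": match.group("path"),
        "params": parse_qs(match.group("params"), keep_blank_values=True),
    }


line = (
    "INFO - [ x:techproducts] ...LogUpdateProcessor; [techproducts] "
    "webapp=/solr path=/update params={} {add=[3007WFP (1515103857103863808)]}"
)
print(parse_request_log(line))

# Hypothetical line with parameters, for illustration only.
print(parse_request_log("webapp=/solr path=/select params={q=test&wt=json}"))
```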
http://localhost:8983/solr/techproducts/config/
"secret" API to get current config
17
Boundaries – example - schemaless magic
<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
  <processor class="solr.UUIDUpdateProcessorFactory" />
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <processor class="solr.RemoveBlankFieldUpdateProcessorFactory"/>
  <processor class="solr.FieldNameMutatingUpdateProcessorFactory">
    <str name="pattern">[^\w-\.]</str>
    <str name="replacement">_</str>
  </processor>
  <processor class="solr.ParseBooleanFieldUpdateProcessorFactory"/>
  <processor class="solr.ParseDateFieldUpdateProcessorFactory">
    <arr name="format">
      <str>yyyy-MM-dd'T'HH:mm:ss.SSSZ</str>
      <str>yyyy-MM-dd</str>
    </arr>
  </processor>
  <processor class="solr.AddSchemaFieldsUpdateProcessorFactory">
    <str name="defaultFieldType">strings</str>
    <lst name="typeMapping">
      <str name="valueClass">java.lang.Boolean</str>
      <str name="fieldType">booleans</str>
    </lst>
    <lst name="typeMapping">
      <str name="valueClass">java.util.Date</str>
      <str name="fieldType">tdates</str>
    </lst>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
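One piece of this chain is easy to try outside Solr: the FieldNameMutatingUpdateProcessorFactory pattern and replacement. A minimal Python sketch, assuming the Java pattern `[^\w-\.]` is read as "anything except word characters, hyphen and dot"; Python's `re` module rejects that exact spelling, so the hyphen is escaped here, and `re.ASCII` limits `\w` to ASCII as Java does by default. The sample field names are hypothetical.

```python
import re

# Python spelling of the character class from the chain configuration.
PATTERN = re.compile(r"[^\w\-.]", re.ASCII)


def mutate_field_name(name, replacement="_"):
    """Replace every character outside [A-Za-z0-9_.-] with the replacement."""
    return PATTERN.sub(replacement, name)


# Hypothetical incoming field names, for illustration only.
for raw in ["first name", "price($)", "date-added", "a.b"]:
    print(raw, "->", mutate_field_name(raw))
```

This only imitates one processor in isolation; the order of processors in the chain still decides what the later ones get to see.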
18
TreeMap – black box
Searching
19
TreeMap – searching - details
1. Choose Request Handler (SearchHandler)
   • /query
   • /export
   • /browse
2. Calculate all parameters
   • URL explicit
   • Handler params (defaults, appends, invariants)
   • Global defaults (initParams)
   • Shared param blocks (useParams)
   • Hardcoded
   • REST-driven overrides
3. Search Components
   <arr name="components">
     <str>query</str>
     <str>facet</str>
     <str>mlt</str>
     <str>highlight</str>
     <str>stats</str>
     <str>debug</str>
   </arr>
4. Query Parsers
   • standard, dismax, edismax, switch, block join, surround, … (>20 parsers)
5. Response writers
   • xml, json, python, ruby, php, velocity, csv, schema.xml, xsort
20
TreeMap – searching - example
21
TreeMap – searching - example
http://localhost:8983/solr/techproducts/browse?
    q=THIS+is+a+TEST&
    wt=xml&
    echoParams=all&
    debugQuery=true
<str name="parsedquery_toString">
+(((features:this | keywords:this^5.0 | author:this^2.0 | cat:THIS^1.4 | name:this^1.2 | manu:this^1.1 | description:this^5.0 | text:this^0.5 | id:THIS^10.0 | resourcename:this | title:this^10.0)
 (features:is | keywords:is^5.0 | author:is^2.0 | cat:is^1.4 | name:is^1.2 | manu:is^1.1 | description:is^5.0 | text:is^0.5 | id:is^10.0 | resourcename:is | title:is^10.0)
 (features:a | keywords:a^5.0 | author:a^2.0 | cat:a^1.4 | name:a^1.2 | manu:a^1.1 | description:a^5.0 | text:a^0.5 | id:a^10.0 | resourcename:a | title:a^10.0)
 (features:test | keywords:test^5.0 | author:test^2.0 | cat:TEST^1.4 | name:test^1.2 | manu:test^1.1 | description:test^5.0 | text:test^0.5 | id:TEST^10.0 | resourcename:test | sku:test^1.5 | title:test^10.0))~4)
</str>
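The same debugging URL can be built for any query, so the parsed form is always one request away. A minimal Python sketch using only the parameters from the example above; the helper name and its signature are assumptions made for illustration.

```python
from urllib.parse import urlencode


def debug_url(base, handler, query, **extra):
    """Build a request URL that echoes all params and turns on query debugging."""
    params = {"q": query, "wt": "xml", "echoParams": "all", "debugQuery": "true"}
    params.update(extra)  # caller-supplied parameters override the ones above
    return f"{base.rstrip('/')}/{handler.lstrip('/')}?{urlencode(params)}"


url = debug_url("http://localhost:8983/solr/techproducts", "browse", "THIS is a TEST")
print(url)
```

Fetching the URL is left out on purpose: it needs a running techproducts example, while building the URL does not.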
22
TreeMap – searching - tools
23
TreeMap – searching - tools
24
TreeMap – Troubleshooting Solr cloud
1. Good luck with exponential complexity increase.
2. Try to reproduce in a standalone instance!
3. Tools exist, but they are themselves complex (e.g. Jepsen)
4. But the TreeMap process is the same overall
Cloud adds many more locations
25
Troubleshooting – closing notes and review
1. Troubleshooting is both art (intuition) and science
2. The more you apply the science, the better you become at the art
3. Remember the overall process
   • Establish the boundaries
   • Split the problem
   • Identify the relevant part
   • Zoom in
   • Re-formulate the boundaries
   • Repeat until fixed/problem identified
4. Remember the boundaries
   • Identity
   • Location
   • Timing
   • Magnitude
26
Troubleshooting – next step
1. My resources and mailing list: http://www.solr-start.com/
2. Solr-users mailing list and archives
   • Identify your boundary in the email
3. Books, current and upcoming
4. Google/Bing/DDG – use good keywords
5. Share what you learned