Introduction to Apache Lucene/Solr

Introduction to Apache Lucene/Solr. CSCI 572: Information Retrieval and Search Engines, Summer 2010. Outline: What is Lucene/Solr? Where did it come from? What are the current versions of Lucene/Solr? What can it do? Apache Lucene: the brainchild of Doug Cutting.


Presentation Transcript

  1. Introduction to Apache Lucene/Solr CSCI 572: Information Retrieval and Search Engines Summer 2010
  2. Outline • What is Lucene/Solr? • Where did it come from? • What are the current versions of Lucene/Solr? • What can it do?
  3. Apache Lucene • The brainchild of Doug Cutting • Free-text indexing library that implements most of the functionality I’ve talked to you about • Query Models, Ranking, Indexing • Core API is implemented in Java • C/C++, Ruby, and Python APIs exist as well, but they have small communities or are automatically generated • Initially hosted on SourceForge, moved to Apache in 2001
  4. Apache Solr • Originally developed at CNET • Web service layer built on top of the Lucene library • Provides a schema and an understanding of field types, with conversion to and from their indexed representation • Scales to very large deployments; runs on top of an application server such as Tomcat or Jetty • Programming-language-independent APIs • Sharding, replication, faceting, highlighting, explain, more-like-this, and other functionality provided easily
  5. How to get started • Lucene (2.9.2 and 3.0.1 stable) • Put your Java hat on • Have Eclipse or your favorite IDE ready • Download lucene-core-.jar from • http://repo1.maven.org/maven2/org/apache/lucene/ • Or download the source and build it from • http://www.apache.org/dyn/closer.cgi/lucene/java/ • Check out some example Java code by Otis Gospodnetic that demonstrates indexing and querying • http://onjava.com/pub/a/onjava/2003/01/15/lucene.html
  6. How to get started • Solr • Grab a release of Solr (1.4.0 stable) • http://www.apache.org/dyn/closer.cgi/lucene/solr/ • Unpack it into, e.g., /usr/local/solr • Deploy onto Tomcat • Install Tomcat into /usr/local/tomcat • Create a solr.xml file and drop it into /usr/local/tomcat/conf/Catalina/localhost/ • Create the solr.home JNDI property and point it to /usr/local/solr/solr • Start Tomcat • Head over to $solr/example/example-docs • curl http://localhost:8983/solr/update -H 'Content-type:text/xml; charset=utf-8' --data-binary @artists.xml
  7. Modifying your schema.xml • Field Types • Analyzers • Tokenizers • http://wiki.apache.org/solr/SchemaXml
  8. Solr Faceting • facet=on&facet.field=&facet.field=… • http://wiki.apache.org/solr/SimpleFacetParameters
  9. Advanced Topics • Standing up cores • Sharding • Replication • ZooKeeper and Cloud
  10. Development currently in flux • Stick with release versions • Depending on trunk won’t really help • Lucene and Solr have merged
  11. Wrapup • Lots more information at • http://lucene.apache.org • http://lucene.apache.org/solr/ • http://lucene.apache.org/java/ • Possible projects • Geospatial search • Improving existing code and contributing back to Apache SIS and to Apache Solr • Improving date faceting • Rewriting the ResponseWriter framework
  12. Acknowledgements • Material inspired by discussions and talks on the Apache mailing lists for Solr and Lucene, and by discussions with the rest of the Lucene community
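
The curl command on slide 6 posts an XML file to Solr's update handler. A minimal Python sketch of the same request, using only the standard library, might look like the following. The URL and the Content-type header come from the slide; the helper name build_update_request, the sample document, and its "id" field are assumptions for illustration, and the fields a real document needs depend on your schema.xml. The request is only built here, not sent, because sending it requires a running Solr instance.

```python
import urllib.request


def build_update_request(xml_bytes, base_url="http://localhost:8983/solr"):
    """Build (but do not send) a POST to Solr's /update handler.

    Mirrors the curl call on slide 6: the body is sent as raw XML with a
    text/xml content type. base_url is an assumption; adjust host and port
    to match your servlet container.
    """
    return urllib.request.Request(
        url=base_url + "/update",
        data=xml_bytes,
        headers={"Content-type": "text/xml; charset=utf-8"},
        method="POST",
    )


# Hypothetical document in Solr's XML update format; the field name "id"
# is illustrative and must exist in your schema.xml.
doc = b'<add><doc><field name="id">artist-1</field></doc></add>'
req = build_update_request(doc)

# To send it against a live server, uncomment:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```

Added documents generally become searchable only after a commit, so a second request with the body <commit/> would typically follow the add.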
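
The faceting parameters on slide 8 (facet=on plus one facet.field per field) can be assembled into a query URL as sketched below. This is an illustration under stated assumptions: the helper name build_facet_url, the base URL, and the field names "genre" and "country" are hypothetical and do not come from the slides. The /select path is Solr's standard query handler.

```python
from urllib.parse import urlencode


def build_facet_url(query, facet_fields, base_url="http://localhost:8983/solr"):
    """Return a Solr /select URL that requests facet counts.

    A list of pairs is used instead of a dict so that facet.field can be
    repeated once per field, as slide 8 shows.
    """
    params = [("q", query), ("facet", "on")]
    params += [("facet.field", field) for field in facet_fields]
    return base_url + "/select?" + urlencode(params)


# Hypothetical field names; they must be indexed fields in your schema.xml.
url = build_facet_url("*:*", ["genre", "country"])
print(url)
```

The query *:* matches all documents, so the facet counts in the response would describe the whole index.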
