Source: www.tutorialspoint.com
spaCy

About the Tutorial

spaCy, developed by Matthew Honnibal and Ines Montani, is an open-source software library for advanced Natural Language Processing (NLP). It is written in Python and Cython (a C extension of Python designed mainly to give Python programs C-like performance). spaCy is a relatively new framework, but it is one of the most powerful and advanced libraries used to implement NLP.

Audience

This tutorial will be useful for graduates, post-graduates, and research students who either have an interest in NLP or study it as part of their curriculum. The reader can be a beginner or an advanced learner.

Prerequisites

The reader should have basic knowledge of NLP and artificial intelligence, and should also be familiar with the basic terminology of English grammar and with Python programming concepts.

Copyright & Disclaimer

Copyright 2021 by Tutorials Point (I) Pvt. Ltd.

All the content and graphics published in this e-book are the property of Tutorials Point (I) Pvt. Ltd. The user of this e-book is prohibited from reusing, retaining, copying, distributing or republishing any contents or a part of contents of this e-book in any manner without written consent of the publisher.

We strive to update the contents of our website and tutorials as timely and as precisely as possible; however, the contents may contain inaccuracies or errors. Tutorials Point (I) Pvt. Ltd. provides no guarantee regarding the accuracy, timeliness or completeness of our website or its contents including this tutorial. If you discover any errors on our website or in this tutorial, please notify us at contact@tutorialspoint.com

Table of Contents

About the Tutorial
Audience
Prerequisites
Copyright & Disclaimer
Table of Contents
1. spaCy — Introduction
    Extensions and visualisers
2. spaCy — Getting Started
3. spaCy — Models and Languages
4. spaCy — Architecture
5. spaCy — Command Line Helpers
6. spaCy — Top-level Functions
7. spaCy — Visualization Function
8. spaCy — Utility Functions
9. spaCy — Compatibility Functions
10. spaCy — Containers
11. spaCy — Doc Class ContextManager and Property
    Retokenizer.split
12. spaCy — Container Token Class
13. spaCy — Token Properties
14. spaCy — Container Span Class
15. spaCy — Span Class Properties
16. spaCy — Container Lexeme Class
17. spaCy — Training Neural Network Model
    Steps for Training
18. spaCy — Updating Neural Network Model

1. spaCy — Introduction

In this chapter, we will look at the features, extensions and visualisers of spaCy. A feature comparison is also provided to help readers analyse the functionality of spaCy against that of the Natural Language Toolkit (NLTK) and CoreNLP. Here, NLP refers to Natural Language Processing.

What is spaCy?

spaCy, developed by Matthew Honnibal and Ines Montani, is an open-source software library for advanced NLP. It is written in Python and Cython (a C extension of Python designed mainly to give Python programs C-like performance). spaCy is a relatively new framework, but it is one of the most powerful and advanced libraries used to implement NLP.

Features

Some of the features of spaCy that make it popular are explained below:

Fast: spaCy is specially designed to be as fast as possible.

Accuracy: spaCy's implementation of its labelled dependency parser makes it one of the most accurate frameworks of its kind (within 1% of the best available).

Batteries included: The batteries included in spaCy are as follows:
    Index-preserving tokenization.
    “Alpha tokenization” support for more than 50 languages.
    Part-of-speech tagging.
    Pre-trained word vectors.
    Built-in, easy-to-use visualizers for named entities and syntax.
    Text classification.

Extensible: You can easily use spaCy with other existing tools such as TensorFlow, Gensim, scikit-learn, etc.

Deep learning integration: It has Thinc, a deep learning framework designed for NLP tasks.

Extensions and visualisers

Some of the easy-to-use extensions and visualisers that come with spaCy, all of them free, open-source libraries, are listed below:

Thinc: It is a Machine Learning (ML) library optimised for Central Processing Unit (CPU) usage. It is also designed for deep learning with text input and NLP tasks.
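The index-preserving tokenization listed among the features above can be illustrated with a minimal sketch. It assumes spaCy is installed and uses a blank English pipeline (tokenizer only), so no pretrained model is required; the sample sentence and the variable names are illustrative assumptions, not part of the original tutorial.

```python
import spacy

# A blank English pipeline contains only the tokenizer, so this sketch
# does not need a pretrained model to be downloaded.
nlp = spacy.blank("en")

# Sample sentence chosen for illustration (assumption).
text = "spaCy keeps character offsets, even after tokenization."
doc = nlp(text)

# Each token records the character offset at which it starts (token.idx).
offsets = [(token.text, token.idx) for token in doc]
for token_text, start in offsets:
    print(f"{start:>3}  {token_text}")

# Joining every token with its trailing whitespace reproduces the input,
# which is what "index preserving" means in practice.
reconstructed = "".join(token.text_with_ws for token in doc)
print(reconstructed == text)
```

Because every token keeps its offset into the original string, results from later processing steps can be mapped back to exact positions in the source text.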