Practical Natural Language Processing. A Comprehensive Guide to Building Real-World NLP Systems (ebook)

Authors: Sowmya Vajjala, Bodhisattwa Majumder, Anuj Gupta

Publisher: O'Reilly Media

Pages: 456

Available formats: ePub, Mobi

Many books and courses tackle natural language processing (NLP) problems with toy use cases and well-defined datasets. But if you want to build, iterate, and scale NLP systems in a business setting and tailor them for particular industry verticals, this is your guide. Software engineers and data scientists will learn how to navigate the maze of options available at each step of the journey.

Through the course of the book, authors Sowmya Vajjala, Bodhisattwa Majumder, Anuj Gupta, and Harshit Surana will guide you through the process of building real-world NLP solutions embedded in larger product setups. You’ll learn how to adapt your solutions for different industry verticals such as healthcare, social media, and retail.

With this book, you’ll:

- Understand the wide spectrum of problem statements, tasks, and solution approaches within NLP
- Implement and evaluate different NLP applications using machine learning and deep learning methods
- Fine-tune your NLP solution based on your business problem and industry vertical
- Evaluate various algorithms and approaches for NLP product tasks, datasets, and stages
- Produce software solutions following best practices around release, deployment, and DevOps for NLP systems
- Understand best practices, opportunities, and the roadmap for NLP from a business and product leader’s perspective

Sowmya Vajjala works at the National Research Council of Canada. She has built multilingual NLP systems.

Bodhisattwa Majumder is a PhD student at the University of California, San Diego. He has built NLP systems at Google AI and Microsoft Research.

Anuj Gupta is a director at Vahan. He has led many machine learning teams.

Foreword
Preface
- Why We Wrote This Book
- The Philosophy
- Scope
- Who Should Read This Book
- What You Will Learn
- Structure of the Book
- How to Read This Book
  - Conventions Used in This Book
  - Using Code Examples
  - O'Reilly Online Learning
  - How to Contact Us
  - Further Information
  - Acknowledgments
I. Foundations
1. NLP: A Primer
- NLP in the Real World
  - NLP Tasks
- What Is Language?
  - Building Blocks of Language
    - Phonemes
    - Morphemes and lexemes
    - Syntax
    - Context
  - Why Is NLP Challenging?
    - Ambiguity
    - Common knowledge
    - Creativity
    - Diversity across languages
- Machine Learning, Deep Learning, and NLP: An Overview
- Approaches to NLP
  - Heuristics-Based NLP
  - Machine Learning for NLP
    - Naive Bayes
    - Support vector machine
    - Hidden Markov Model
    - Conditional random fields
  - Deep Learning for NLP
    - Recurrent neural networks
    - Long short-term memory
    - Convolutional neural networks
    - Transformers
    - Autoencoders
  - Why Deep Learning Is Not Yet the Silver Bullet for NLP
- An NLP Walkthrough: Conversational Agents
- Wrapping Up
2. NLP Pipeline
- Data Acquisition
- Text Extraction and Cleanup
  - HTML Parsing and Cleanup
  - Unicode Normalization
  - Spelling Correction
  - System-Specific Error Correction
- Pre-Processing
  - Preliminaries
    - Sentence segmentation
    - Word tokenization
  - Frequent Steps
    - Stemming and lemmatization
  - Other Pre-Processing Steps
    - Text normalization
    - Language detection
    - Code mixing and transliteration
  - Advanced Processing
- Feature Engineering
  - Classical NLP/ML Pipeline
  - DL Pipeline
- Modeling
  - Start with Simple Heuristics
  - Building Your Model
  - Building THE Model
- Evaluation
  - Intrinsic Evaluation
  - Extrinsic Evaluation
- Post-Modeling Phases
  - Deployment
  - Monitoring
  - Model Updating
- Working with Other Languages
- Case Study
- Wrapping Up
3. Text Representation
- Vector Space Models
- Basic Vectorization Approaches
  - One-Hot Encoding
  - Bag of Words
  - Bag of N-Grams
  - TF-IDF
- Distributed Representations
  - Word Embeddings
    - Pre-trained word embeddings
    - Training our own embeddings
      - CBOW
      - SkipGram
  - Going Beyond Words
- Distributed Representations Beyond Words and Characters
- Universal Text Representations
- Visualizing Embeddings
- Handcrafted Feature Representations
- Wrapping Up
II. Essentials
4. Text Classification
- Applications
- A Pipeline for Building Text Classification Systems
  - A Simple Classifier Without the Text Classification Pipeline
  - Using Existing Text Classification APIs
- One Pipeline, Many Classifiers
  - Naive Bayes Classifier
  - Logistic Regression
  - Support Vector Machine
- Using Neural Embeddings in Text Classification
  - Word Embeddings
  - Subword Embeddings and fastText
  - Document Embeddings
- Deep Learning for Text Classification
  - CNNs for Text Classification
  - LSTMs for Text Classification
  - Text Classification with Large, Pre-Trained Language Models
- Interpreting Text Classification Models
  - Explaining Classifier Predictions with Lime
- Learning with No or Less Data and Adapting to New Domains
  - No Training Data
  - Less Training Data: Active Learning and Domain Adaptation
- Case Study: Corporate Ticketing
- Practical Advice
- Wrapping Up
5. Information Extraction
- IE Applications
- IE Tasks
- The General Pipeline for IE
- Keyphrase Extraction
  - Implementing KPE
  - Practical Advice
- Named Entity Recognition
  - Building an NER System
  - NER Using an Existing Library
  - NER Using Active Learning
  - Practical Advice
- Named Entity Disambiguation and Linking
  - NEL Using Azure API
- Relationship Extraction
  - Approaches to RE
  - RE with the Watson API
- Other Advanced IE Tasks
  - Temporal Information Extraction
  - Event Extraction
  - Template Filling
- Case Study
- Wrapping Up
6. Chatbots
- Applications
  - A Simple FAQ Bot
- A Taxonomy of Chatbots
  - Goal-Oriented Dialog
  - Chitchats
- A Pipeline for Building Dialog Systems
- Dialog Systems in Detail
  - PizzaStop Chatbot
    - Building our Dialogflow agent
    - Testing our agent
- Deep Dive into Components of a Dialog System
  - Dialog Act Classification
  - Identifying Slots
  - Response Generation
  - Dialog Examples with Code Walkthrough
    - Datasets
    - Dialog act prediction
      - Loading the dataset
      - Models
    - Slot identification
      - Loading the dataset
      - Models
- Other Dialog Pipelines
  - End-to-End Approach
  - Deep Reinforcement Learning for Dialogue Generation
  - Human-in-the-Loop
- Rasa NLU
- A Case Study: Recipe Recommendations
  - Utilizing Existing Frameworks
  - Open-Ended Generative Chatbots
- Wrapping Up
7. Topics in Brief
- Search and Information Retrieval
  - Components of a Search Engine
  - A Typical Enterprise Search Pipeline
  - Setting Up a Search Engine: An Example
  - A Case Study: Book Store Search
- Topic Modeling
  - Training a Topic Model: An Example
  - What's Next?
- Text Summarization
  - Summarization Use Cases
  - Setting Up a Summarizer: An Example
  - Practical Advice
- Recommender Systems for Textual Data
  - Creating a Book Recommender System: An Example
  - Practical Advice
- Machine Translation
  - Using a Machine Translation API: An Example
  - Practical Advice
- Question-Answering Systems
  - Developing a Custom Question-Answering System
  - Looking for Deeper Answers
- Wrapping Up
III. Applied
8. Social Media
- Applications
- Unique Challenges
- NLP for Social Data
  - Word Cloud
  - Tokenizer for SMTD
  - Trending Topics
  - Understanding Twitter Sentiment
  - Pre-Processing SMTD
    - Removing markup elements
    - Handling non-text data
    - Handling apostrophes
    - Handling emojis
    - Split-joined words
    - Removal of URLs
    - Nonstandard spellings
  - Text Representation for SMTD
  - Customer Support on Social Channels
- Memes and Fake News
  - Identifying Memes
  - Fake News
- Wrapping Up
9. E-Commerce and Retail
- E-Commerce Catalog
  - Review Analysis
  - Product Search
  - Product Recommendations
- Search in E-Commerce
- Building an E-Commerce Catalog
  - Attribute Extraction
    - Direct attribute extraction
    - Indirect attribute extraction
  - Product Categorization and Taxonomy
  - Product Enrichment
  - Product Deduplication and Matching
    - Attribute match
    - Title match
    - Image match
- Review Analysis
  - Sentiment Analysis
  - Aspect-Level Sentiment Analysis
    - Supervised approach
    - Unsupervised approach
  - Connecting Overall Ratings to Aspects
  - Understanding Aspects
- Recommendations for E-Commerce
  - A Case Study: Substitutes and Complements
    - Latent attribute extraction from reviews
    - Product linking
- Wrapping Up
10. Healthcare, Finance, and Law
- Healthcare
  - Health and Medical Records
  - Patient Prioritization and Billing
  - Pharmacovigilance
  - Clinical Decision Support Systems
  - Health Assistants
  - Electronic Health Records
    - HARVEST: Longitudinal report understanding
    - Question answering for health
    - Outcome prediction and best practices
  - Mental Healthcare Monitoring
  - Medical Information Extraction and Analysis
- Finance and Law
  - NLP Applications in Finance
    - Financial sentiment
    - Risk assessments
    - Accounting and auditing
  - NLP and the Legal Landscape
    - Legal entity extraction with LexNLP
- Wrapping Up
IV. Bringing It All Together
11. The End-to-End NLP Process
- Revisiting the NLP Pipeline: Deploying NLP Software
  - An Example Scenario
- Building and Maintaining a Mature System
  - Finding Better Features
  - Iterating Existing Models
  - Code and Model Reproducibility
  - Troubleshooting and Interpretability
  - Monitoring
  - Minimizing Technical Debt
  - Automating Machine Learning
    - auto-sklearn
    - Google Cloud AutoML and other techniques
- The Data Science Process
  - The KDD Process
  - Microsoft Team Data Science Process
- Making AI Succeed at Your Organization
  - Team
  - Right Problem and Right Expectations
  - Data and Timing
  - A Good Process
  - Other Aspects
- Peeking over the Horizon
- Final Words
Index


Ebook ISBN: 978-14-920-5400-9, 9781492054009
Ebook release date: 2020-06-17
Publication language: English
ePub file size: 26.2 MB
Mobi file size: 62.9 MB

This product has not yet been assessed for accessibility features, or no accessibility information has been provided, or the information provided is insufficient. The publisher or supplier has probably not yet enabled validation of the product or has not supplied the relevant information about its accessibility.