Book (Practical) Information science Practical Python-based Bayesian analysis and topic model / Iwao Fujino

※Please note that the product information may not be fully accurate because it is machine translated.
Japanese title: 単行本(実用) 情報科学 実践 Pythonによるベイズ分析とトピックモデル / 藤野巖
Out of stock
Item number: BO4407252
Released date: 03 Apr 2024
Author: Iwao Fujino (藤野巖)

Product description

Information Science
[Introduction to Content]
[Features of the Book]
This book explains Bayesian analysis, and by extension topic models, from the standpoint of both theory and practice.
The topic model was proposed as a natural language processing technique, and is a probabilistic model that can discover latent and deep topics from a large amount of document data.
In recent years, its power has been applied not only to document data but also to image data and trajectory data analysis, and it has become a basic technology supporting artificial intelligence (AI) along with deep learning.
In this book, we have tried to cover the theoretical basics firmly while letting readers learn in a practical way, through programming wherever possible.
We have also kept each step small so that readers can climb from one topic to the next without difficulty.
[About each chapter]
Chapter 1 : Explains the probability and probability distributions needed to follow this book and to implement its programs.
Chapter 2 : Reviews basic data analysis methods as a point of comparison.
Chapter 3 : Explains the basic concepts of Bayesian analysis.
It also introduces the PyMC library, which is used to implement Bayesian analysis in programs.
Chapter 4 : Reviews conventional document data analysis techniques as a point of comparison.
Chapter 5 : Constructs a Unigram model for analyzing document data.
A program implementation using the PyMC library is also shown.
Chapter 6 : Constructs a mixed Unigram model that incorporates the concept of topics.
An example of a document analysis program using the mixed Unigram model is also shown.
Chapter 7 : Develops the mixed Unigram model further into a topic model. An example of a document analysis program using a topic model is also shown.
Chapter 8 : Explains how to use the topic model modules in the Scikit-learn library.
These modules are then used to extract topics from English document data in the 20 Newsgroups dataset.
Chapter 9 : Explains how to use Gensim, a library dedicated to topic models.
Gensim is then used to extract topics from Japanese Wikipedia document data.
Chapter 10 : Extends the topic model to build the author topic model.
The Gensim library is then used to extract topics from Japanese posts collected from Twitter.
Chapter 11 : Applies topic models to image data sets.
The Gensim library is used to extract topics in small cells from a dataset called Caltech101.
Chapter 12 : Applies topic models to trajectory data sets.
The Gensim library is used to extract topics from ship AIS data such as navigation routes (courses).
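
As a rough illustration of the kind of program Chapter 5 describes, the sketch below generates documents from a Unigram model and then estimates the word probabilities back from word frequencies. It is not taken from the book: the book's implementation uses PyMC, while this sketch uses only the Python standard library and a plain maximum-likelihood estimate. The vocabulary, the probabilities, and the function names `generate_documents` and `estimate_word_probs` are assumptions made up for the example.

```python
import random
from collections import Counter


def generate_documents(vocab, probs, n_docs, doc_len, seed=0):
    """Unigram model: every word of every document is drawn independently
    from one shared categorical distribution over the vocabulary."""
    rng = random.Random(seed)
    return [rng.choices(vocab, weights=probs, k=doc_len) for _ in range(n_docs)]


def estimate_word_probs(docs, vocab):
    """Maximum-likelihood estimate: relative frequency of each word
    across the whole document set."""
    counts = Counter(word for doc in docs for word in doc)
    total = sum(counts.values())
    return {word: counts[word] / total for word in vocab}


# Hypothetical vocabulary and word probabilities, chosen only for illustration.
vocab = ["bayes", "topic", "model", "python"]
true_probs = [0.4, 0.3, 0.2, 0.1]

docs = generate_documents(vocab, true_probs, n_docs=200, doc_len=50)
estimated = estimate_word_probs(docs, vocab)

for word, p in zip(vocab, true_probs):
    print(f"{word}: true={p:.2f} estimated={estimated[word]:.3f}")
```

With enough generated words, the estimated probabilities should land close to the true ones. A Bayesian treatment, such as the one the book builds with PyMC, would place a prior on the word probabilities instead of using raw frequencies.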
[Message from author]
Practice is a shortcut to technology acquisition.
As you read this book, be sure to practice it repeatedly.
In addition, it will be even more effective to work on problems modeled on practical applications.
We hope that readers of this book will improve their advanced data analysis skills and play an active role in the field.
[Contents]
☆ This is pre-publication information and may be subject to change.
1. Probability and Probability distributions
1.1 Probability and Probability distributions
1.1.1 Probability
1.1.2 Probability distributions
1.2 Conditional probability and joint probability
1.2.1 Conditional probability
1.2.2 Joint probability
1.3 Multiplication Theorem and Bayes' Theorem
1.4 Various Probability Distributions
1.4.1 Bernoulli Distribution
1.4.2 Binomial Distribution
1.4.3 Categorical distribution
1.4.5 Exponential distribution
1.4.6 Beta distribution
1.4.7 Gamma distribution
1.4.8 Dirichlet distribution
4.4.2 How to Compute Cosine Similarity Using the Scikit-learn Library
Example Program to Calculate a Cosine Similarity
Exercise Questions
5. Unigram Model
Stochastic model for generating a set of documents
Graphical model representation
Program for generating a set of documents based on a Unigram Model
Graphical model representation
Program for generating a set of documents based on a Unigram Model
Program for the frequency of words in a set of documents
Program for generating a set of documents based on a Unigram Model
Program for the frequency of words in a set of documents
Parameter estimation of a Unigram Model
Creating Datasets
Creating Author Topic Models with PyMC
Author Topic Models with Gensim
Author Topic Models with Gensim
Author Topic Model Classes in Gensim
Preparing Datasets from Twitter Data
Applying Author Topic Models to Twitter Data Sets
Applying Author Topic Models to Twitter Data Sets
Exercise Questions
11. Topic Extraction from Image Data Sets
Prerequisites for Applying Topic Models to Image Data
Creating Images from Image Data Sets
Applying Topic Models to Image Data Sets