Rasa is an open source library for building conversational interfaces, most commonly called “chatbots”. Building a good chatbot is not an easy task, and the people at Rasa know it; that’s why their library contains many different components working together.

One of these components is the NLU (Natural Language Understanding), which turned out to be the best choice for a set of projects where I needed to build a model for Named Entity Recognition. In this post I’ll share some of the reasons I decided to go with Rasa.

Rasa was built to create chatbots, so an important concept in this kind of tool is the “intent”. What the NLU tries to do is distinguish between the different “intentions” a user has when they input some text.

For example, if the user input is something like “Hello!”, the desired intent can be called “greet”.

If the input is something like “I’d like to order a pizza”, the desired intent can be called “order”.

As you can see, the intent mechanism works like a classification model: the user’s text is the input and the predicted intent is the output.


The way Rasa is built makes it very easy to add examples to train the NLU. You don’t need to be a programmer or have technical expertise to help; with good documentation, almost anyone can contribute(*). This makes it cheaper and faster to move your project forward.

(*) There are exceptions, such as adding regular expressions or configuring external components.


The NLU is trained from several files located in the /data folder.

Let’s see an example:

- intent: check_balance
  examples: |
    - What's my [credit](account) balance?
    - What's the balance on my [credit card account]{"entity":"account","value":"credit"}

In the code above we can see that for the “check_balance” intent we want to extract the account entity type. The words/tokens “credit” and “credit card account” both refer to the same “account” entity type.

- synonym: credit
  examples: |
    - credit card account
    - credit account

This way you can clearly tell the NLU to consider “credit card account” and “credit account” as synonyms to “credit”.
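Putting the two snippets together, a complete training file might look like this (a sketch assuming Rasa 3.x, where NLU data sits under a top-level nlu: key; the exact version value may differ for your install):

```yaml
version: "3.1"

nlu:
- intent: check_balance
  examples: |
    - What's my [credit](account) balance?
    - What's the balance on my [credit card account]{"entity":"account","value":"credit"}

- synonym: credit
  examples: |
    - credit card account
    - credit account
```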

Then we can train the NLU with this command:

rasa train nlu


To use this NLU component on its own, run the service like this:

rasa run --enable-api -m models/CURRENTMODEL.tar.gz

Once the service is up and running we can make requests:

curl SERVER_IP:SERVER_PORT/model/parse -d '{"text": "SENTENCE"}'

Beautiful: we don’t need to build a custom API, the service is ready to use as is.
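The same request is easy to make from code. Below is a minimal sketch of a client using only the standard library; 5005 is Rasa’s default port, and the sample response is an illustrative example of the shape the /model/parse endpoint returns (intent plus entities), not output captured from a real server:

```python
import json
from urllib import request

def parse(text, server="http://localhost:5005"):
    """POST a sentence to a running Rasa server's /model/parse endpoint."""
    req = request.Request(
        f"{server}/model/parse",
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

# Illustrative stand-in for a parse response (not real server output).
sample_response = {
    "text": "What's my credit balance?",
    "intent": {"name": "check_balance", "confidence": 0.97},
    "entities": [
        {"entity": "account", "value": "credit", "start": 10, "end": 16},
    ],
}

# Pull out the predicted intent and any extracted "account" entities.
intent = sample_response["intent"]["name"]
accounts = [e["value"] for e in sample_response["entities"]
            if e["entity"] == "account"]
print(intent, accounts)
```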


rasa test nlu --config config.yml --nlu data/ --cross-validation

This command takes the “config.yml” file, where you can customize the pipeline and its components, and uses all the training files in the /data folder to evaluate the NLU model and generate performance results. You don’t need to split the data into training and test sets; that’s handled automatically by the --cross-validation parameter.

The beauty of this is that it automatically generates files with all the metrics (F-score, precision, recall) for each entity type. It’s amazing how much time this can save you.
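To get a feel for what those generated files contain, here is a small sketch: the dictionary below stands in for the sklearn-style per-entity report the evaluation writes out (exact file names and fields depend on your Rasa version and pipeline, so treat the structure as illustrative):

```python
# Illustrative stand-in for a per-entity evaluation report; the numbers
# are made up and the aggregate rows mimic sklearn's classification report.
sample_entity_report = {
    "account":   {"precision": 0.91, "recall": 0.88, "f1-score": 0.89, "support": 42},
    "micro avg": {"precision": 0.90, "recall": 0.87, "f1-score": 0.88, "support": 42},
}

# Print one line per entity type, skipping the aggregate rows.
for entity, m in sample_entity_report.items():
    if entity.endswith("avg"):
        continue
    print(f"{entity}: f1={m['f1-score']:.2f} "
          f"precision={m['precision']:.2f} recall={m['recall']:.2f}")
```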


Adding components to the pipeline is really simple and transparent. Each component is represented by an independent Python script, so you don’t have integration problems.

For example, Duckling, an open source library developed by Facebook that extracts times, dates, numbers, durations and other structured values from text, is easily integrated into Rasa by adding the following lines to the config.yml file (Duckling runs as a separate HTTP service, so the url field should point to a running Duckling server):

- name: "DucklingHTTPExtractor"
  url: ""
  locale: "en_GB"
  dimensions: ["time", "amount-of-money", "duration", "ordinal", "number"]


These are some of the features that made me choose Rasa for my latest named entity recognition projects. I encourage you to try it for yourself.

