Word Normalizer

Correcting the words in a sentence makes the sentence easier for readers to understand. Corrections are applied to words with typos or spelling errors, and to non-standard words that are not found in the Kamus Besar Bahasa Indonesia V (https://kbbi.kemdikbud.go.id/), such as slang or foreign-language words. This word repair can be done using the Word Normalizer module.

Input can range from structured text, such as news articles, to unstructured text, such as social media posts.

Examples of errors that Word Normalizer can repair:

  • Typo

e.g.: “makn” should be “makan” in the sentence “Aku mau makn.”

  • Misspelling

e.g.: “Bu” should be “Bu,” in the sentence “Bu saya mau makan.”

  • Slang Word

e.g.: “bet” should be “banget” in the sentence “Soal itu menurut gw susah bet sih.”

Illustration

[Image: Word Normalizer illustration]

Request Method

POST

Request URL

https://api.prosa.ai/v2/text/normalizer

Request Header

| Key | Data Type | Description | Value |
| --- | --- | --- | --- |
| Content-Type | string | Media type of the body sent to the API. Only 'application/json' is supported. | application/json |
| x-api-key | string | API key acquired from the Prosa API Console | [YOUR_API_KEY] |

Request Body

The request body accepts the following parameter(s) in JSON format.

| Parameter | Data Type | Description | Auto Required |
| --- | --- | --- | --- |
| text | string | Text to be processed | True |

Example

Sample Request Body (JSON)

{
  "text": "di bandung sdh ada ditaman, di sekolah2."
}

Sample Response Body (JSON)

{
    "normalized_text": "di bandung sudah ada di taman, di sekolah-sekolah."
}
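
The request above can be sketched in Python using only the standard library. The URL, headers, and the `text` and `normalized_text` fields come from this page; the helper names `build_normalizer_request` and `normalize` are illustrative, and error handling (for example, for an invalid API key) is left out because the error responses are not documented here.

```python
import json
import urllib.request

NORMALIZER_URL = "https://api.prosa.ai/v2/text/normalizer"


def build_normalizer_request(text: str, api_key: str) -> urllib.request.Request:
    """Build the POST request described above (helper name is illustrative)."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        NORMALIZER_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "x-api-key": api_key},
    )


def normalize(text: str, api_key: str) -> str:
    """Send the request and return the "normalized_text" field of the response."""
    request = build_normalizer_request(text, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["normalized_text"]


# Usage (needs a real key from the Prosa API Console and network access):
# print(normalize("di bandung sdh ada ditaman, di sekolah2.", "[YOUR_API_KEY]"))
```

Building the request separately from sending it makes it possible to inspect the URL, headers, and body without touching the network.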

Additional Feature: Whitelist

Whitelist is a feature that lets users define a list of strings that the Word Normalizer API must not process or change. This lets users tailor the normalizer to their company’s needs.

The whitelist is tied to an API key, so changes to a whitelist apply only to the applications that use that key.

There are five endpoints for maintaining a whitelist: Create, Get, Delete, Patch Add, and Patch Remove.

1. Create new whitelist

Note: This request can only be used when the API key currently has no whitelist, either because a whitelist is being created for that key for the first time or because the previous whitelist has been deleted. To add entries to or remove entries from an existing whitelist, use Patch Add or Patch Remove (explained in the sections below).

Request Method

POST

Request URL

https://api.prosa.ai/v1/text-api-data

Request Header

| Key | Data Type | Description | Value |
| --- | --- | --- | --- |
| Content-Type | string | Media type of the body sent to the API. Only 'application/json' is supported. | application/json |
| x-api-key | string | API key acquired from the Prosa API Console | [YOUR_API_KEY] |

Request Body

The request body accepts the following parameter(s) in JSON format.

| Parameter | Data Type | Description | Auto Required |
| --- | --- | --- | --- |
| target | string | Target API URL (value must be "/normals") | True |
| data | dict | Whitelist data to add (see the sample request body below) | True |

Example

Sample Request Body (JSON)

{
    "target": "/normals",
    "data": {
        "white_list": [
            "jln","HCI","SDB"
        ]
    }
}

Sample Response Body (JSON)

{
    "message": "api data has been saved"
}
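
As a rough illustration, the create request could be assembled like this in Python. The endpoint, the `target` value, and the `white_list` key are taken from the sample above; the function name `build_create_whitelist_request` is hypothetical. How the API responds when a whitelist already exists for the key is not documented on this page, so the sketch does not try to handle that case.

```python
import json
import urllib.request

WHITELIST_URL = "https://api.prosa.ai/v1/text-api-data"


def build_create_whitelist_request(entries, api_key: str) -> urllib.request.Request:
    """Build the POST request that creates a new whitelist (illustrative helper)."""
    payload = {
        "target": "/normals",  # fixed value required by the documentation
        "data": {"white_list": list(entries)},
    }
    return urllib.request.Request(
        WHITELIST_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json", "x-api-key": api_key},
    )


# Usage (needs a real key and network access):
# with urllib.request.urlopen(
#     build_create_whitelist_request(["jln", "HCI", "SDB"], "[YOUR_API_KEY]")
# ) as response:
#     print(json.load(response)["message"])
```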

2. Get whitelist data

Request Method

GET

Request URL

https://api.prosa.ai/v1/text-api-data

Request Header

| Key | Data Type | Description | Value |
| --- | --- | --- | --- |
| x-api-key | string | API key acquired from the Prosa API Console | [YOUR_API_KEY] |

Request Params

| Key | Data Type | Description | Value |
| --- | --- | --- | --- |
| target | string | Target API URL | /normals |

Example

Sample Response Body (JSON)

{
    "target": "/normals",
    "api_key": "[YOUR_API_KEY]",
    "data": {
        "white_list": [
            "jln",
            "HCI",
            "SDB"
        ]
    }
}
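
A minimal sketch of the Get request and of reading the entries out of a response shaped like the sample above. Here `target` is sent as a URL query parameter, as the Request Params table indicates; the helper names are hypothetical, and the response for a key with no whitelist is not documented here, so `extract_whitelist` assumes the `data.white_list` structure shown in the sample.

```python
import json
import urllib.parse
import urllib.request

WHITELIST_URL = "https://api.prosa.ai/v1/text-api-data"


def build_get_whitelist_request(api_key: str) -> urllib.request.Request:
    """Build the GET request with target=/normals as a query parameter."""
    query = urllib.parse.urlencode({"target": "/normals"})
    return urllib.request.Request(
        f"{WHITELIST_URL}?{query}",
        method="GET",
        headers={"x-api-key": api_key},
    )


def extract_whitelist(response_body: str) -> list:
    """Return the whitelist entries from a response body like the sample above."""
    parsed = json.loads(response_body)
    return parsed["data"]["white_list"]


sample = '{"target": "/normals", "api_key": "[YOUR_API_KEY]", "data": {"white_list": ["jln", "HCI", "SDB"]}}'
print(extract_whitelist(sample))
```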

3. Delete whitelist data

Note: Be careful! This request deletes all whitelist data that was previously added.

Request Method

DELETE

Request URL

https://api.prosa.ai/v1/text-api-data

Request Header

| Key | Data Type | Description | Value |
| --- | --- | --- | --- |
| x-api-key | string | API key acquired from the Prosa API Console | [YOUR_API_KEY] |

Request Params

| Key | Data Type | Description | Value |
| --- | --- | --- | --- |
| target | string | Target API URL | /normals |

Example

Sample Response Body (JSON)

{
    "message": "api data has been deleted"
}
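
Because this request removes the whole whitelist, a client may want an explicit opt-in before sending it. The sketch below shows one possible guard; the `confirm` flag and the function name are design choices of this example and are not part of the API.

```python
import urllib.parse
import urllib.request

WHITELIST_URL = "https://api.prosa.ai/v1/text-api-data"


def build_delete_whitelist_request(api_key: str, confirm: bool = False) -> urllib.request.Request:
    """Build the DELETE request, refusing unless the caller opts in explicitly."""
    if not confirm:
        # Client-side safety check only; the API itself has no such flag.
        raise ValueError("refusing to build a DELETE request without confirm=True")
    query = urllib.parse.urlencode({"target": "/normals"})
    return urllib.request.Request(
        f"{WHITELIST_URL}?{query}",
        method="DELETE",
        headers={"x-api-key": api_key},
    )


# Usage (needs a real key and network access):
# urllib.request.urlopen(build_delete_whitelist_request("[YOUR_API_KEY]", confirm=True))
```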

4. Patch add whitelist data

Note: This request adds one or more text entries to the existing whitelist.

Request Method

PATCH

Request URL

https://api.prosa.ai/v1/text-api-data/add

Request Header

| Key | Data Type | Description | Value |
| --- | --- | --- | --- |
| x-api-key | string | API key acquired from the Prosa API Console | [YOUR_API_KEY] |

Request Body

The request body accepts the following parameter(s) in JSON format.

| Parameter | Data Type | Description | Auto Required |
| --- | --- | --- | --- |
| target | string | Target API URL (value must be "/normals") | True |
| data | dict | Whitelist data to add (see the sample request body below) | True |

Example

Sample Request Body (JSON)

{
    "target": "/normals",
    "data": {
        "white_list": [
            "kpn", "bca"
        ]
    }
}

Sample Response Body (JSON)

{
    "message": "api data has been updated"
}
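
The two PATCH endpoints on this page differ only in the last URL segment (`/add` or `/remove`), so a single helper can cover both. This is a sketch under assumptions: the name `build_patch_request` is hypothetical, and the `Content-Type` header is added because the body is JSON, even though the header table for this request lists only `x-api-key`.

```python
import json
import urllib.request

BASE_URL = "https://api.prosa.ai/v1/text-api-data"


def build_patch_request(action: str, entries, api_key: str) -> urllib.request.Request:
    """Build a PATCH request for the "add" or "remove" whitelist endpoint."""
    if action not in ("add", "remove"):
        raise ValueError("action must be 'add' or 'remove'")
    payload = {"target": "/normals", "data": {"white_list": list(entries)}}
    return urllib.request.Request(
        f"{BASE_URL}/{action}",
        data=json.dumps(payload).encode("utf-8"),
        method="PATCH",
        # Content-Type is an assumption of this sketch (see the note above).
        headers={"Content-Type": "application/json", "x-api-key": api_key},
    )


# Usage (needs a real key and network access):
# urllib.request.urlopen(build_patch_request("add", ["kpn", "bca"], "[YOUR_API_KEY]"))
```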

5. Patch remove whitelist data

Note: This request removes one or more text entries from the existing whitelist.

Request Method

PATCH

Request URL

https://api.prosa.ai/v1/text-api-data/remove

Request Header

| Key | Data Type | Description | Value |
| --- | --- | --- | --- |
| x-api-key | string | API key acquired from the Prosa API Console | [YOUR_API_KEY] |

Request Body

The request body accepts the following parameter(s) in JSON format.

| Parameter | Data Type | Description | Auto Required |
| --- | --- | --- | --- |
| target | string | Target API URL (value must be "/normals") | True |
| data | dict | Whitelist data to remove (see the sample request body below) | True |

Example

Sample Request Body (JSON)

{
    "target": "/normals",
    "data": {
        "white_list": [
            "kpn", "umgr"
        ]
    }
}

Sample Response Body (JSON)

{
    "message": "api data has been updated"
}
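
Since Create only works when no whitelist exists, while Patch Add and Patch Remove only modify an existing one, a client that wants to bring the whitelist to a desired state has to choose between them. The sketch below plans that choice as a pure function, without sending anything. The function name and the `(method, path, entries)` step format are hypothetical, entries are compared by exact string match (an assumption; the matching rules of the API are not documented here), and detecting that no whitelist exists is left to the caller, who passes `None` in that case.

```python
from typing import Iterable, List, Optional, Tuple

# (HTTP method, URL path, white_list entries to send in the request body)
Step = Tuple[str, str, List[str]]


def plan_whitelist_sync(existing: Optional[Iterable[str]], desired: Iterable[str]) -> List[Step]:
    """Return the requests needed to turn the existing whitelist into the desired one."""
    desired_list = list(dict.fromkeys(desired))  # drop duplicates, keep order
    if existing is None:
        # No whitelist yet: Create is the only endpoint that applies.
        return [("POST", "/v1/text-api-data", desired_list)] if desired_list else []
    existing_list = list(dict.fromkeys(existing))
    to_add = [entry for entry in desired_list if entry not in existing_list]
    to_remove = [entry for entry in existing_list if entry not in desired_list]
    steps: List[Step] = []
    if to_add:
        steps.append(("PATCH", "/v1/text-api-data/add", to_add))
    if to_remove:
        steps.append(("PATCH", "/v1/text-api-data/remove", to_remove))
    return steps


for step in plan_whitelist_sync(["jln", "HCI", "SDB"], ["jln", "kpn"]):
    print(step)
```

Keeping the planning separate from the HTTP calls makes the decision logic easy to check before any whitelist data is changed.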

Version History

Below is the version history of our Word Normalizer API.

| Version | F1 | Test Data |
| --- | --- | --- |
| 1.0 | 54.63% | 52,082 tokens |

Questions?

We do our best to make this documentation clear and user-friendly. If you still have questions, please email support@prosa.ai.