Identify addresses and their parts in text

Powerful transformation of text into structured data: which entities are mentioned in the text, and what do we know about them?
GetAddresses
input: string[] Documents, enum DocumentFormat
output: Entities[] AddressesResult

Takes as input a list of Documents and their DocumentFormat (text or URL), and extracts a collection of addresses from the document contents.
The title of an address entity is the concatenation of the values of the address subentities found in the text. Subentities are the parts of an address, each with a type assigned to it. The subentity types are: Country, State, Locality, Suburb, Street address, Postal delivery, Post code, Level and Unit.
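
The relationship between an address title and its subentities can be illustrated with a minimal sketch. It does not call the Pingar API: AddressPart and BuildTitle are hypothetical stand-ins, the sample values are made up, and the ", " separator is an assumption, since the description above only says the values are concatenated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the API's SubEntity type; only Type and Title are modelled.
public class AddressPart
{
    public string Type { get; set; }
    public string Title { get; set; }
}

public static class AddressTitleSketch
{
    // Joins the subentity values in the order they are given.
    // The ", " separator is an assumption, not documented API behaviour.
    public static string BuildTitle(IEnumerable<AddressPart> parts)
    {
        return string.Join(", ", parts.Select(p => p.Title));
    }

    public static void Main()
    {
        // Made-up sample values, using subentity types from the list above.
        var parts = new List<AddressPart>
        {
            new AddressPart { Type = "Street address", Title = "1 Example Street" },
            new AddressPart { Type = "Locality", Title = "Auckland" },
            new AddressPart { Type = "Post code", Title = "1010" },
            new AddressPart { Type = "Country", Title = "New Zealand" }
        };

        // Prints: 1 Example Street, Auckland, 1010, New Zealand
        Console.WriteLine(BuildTitle(parts));
    }
}
```

In the real response the title is already provided as entity.Title, so a helper like this is only useful for seeing how the title relates to its parts.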

Currently, only New Zealand and Australian addresses are supported; more are to be added in the near future.

See also Taxonomy extraction.

See also Entity extraction.

Sample code in C#:

// Build the request and supply your credentials.
PingarAPIRequest request = new PingarAPIRequest();
request.AppID = "your app id";
request.AppKey = "your app key";

// Pass the documents to analyse and state their format.
request.EntityExtraction = new EntityExtractionRequest();
request.EntityExtraction.Documents = new string[] { "document text" };
request.EntityExtraction.DocumentsFormat = DocumentFormat.Text;
request.Language = Language.EN;

// Call the service; count tracks the index of the document being printed.
PingarAPIServiceSoapClient pingarAPI = new PingarAPIServiceSoapClient();
PingarAPIResponse response = pingarAPI.GetAddresses(request);
int count = 0;
if (response.Error == null)
{
    foreach (Entities document in response.EntityExtraction.AddressesResult)
    {
        Console.WriteLine("Address Entities For Document " + count);
        foreach (Entity entity in document.entity)
        {
            Console.WriteLine(entity.Title);
            if (entity.SubEntities != null)
            {
                foreach (SubEntity subentity in entity.SubEntities)
                {
                    Console.WriteLine("\t" + subentity.Type + ": " +
                        subentity.Title);
                }
            }
        }
        count++;
    }
}

 