Origins and GDPR

You can read more information here


What information is used to build Origins?

The Origins classification is built from a global file containing the personal and family names of some 527,000,000 adults from around the world. In addition, we have access to personal and family name frequencies covering another 529,000,000 adults. Together, these roughly one billion adults are resident in 18 different countries.

Using this information we have been able to establish the likely Origins code for some 2,000,000 different family names and some 700,000 personal names.

Who Uses Origins and How is it Used?

Origins is used to profile customers and customer segments, citizens and service users, employees and even suppliers. By profiling customers you can identify which groups are under- or over-represented on your customer file. You can find out which groups prefer which products, channels and outlets, which groups you are good or poor at retaining, and which are responsive to which types of promotion or reward.

Origins is used to code customers. By coding customers you can target campaigns to improve awareness and take-up of public services by members of specific minority groups. You can also target products, such as cosmetics, media channels and travel, at the audiences for whom they have been especially developed.

Origins is used to classify postcodes. Using a table which identifies the dominant Origins type in each postcode you can identify and map the locations in which individual communities have established themselves right down to street level.

How does Origins work?

In order to code individual customers, Origins makes use of a table which contains information on over 700,000 personal names and over 2,000,000 family names. Each of these names has been examined in such a way as to identify the Origins type to which it is most likely to belong.


This evaluation makes use of a number of criteria, including the Origins codes of the surnames held by bearers of each personal name, and vice versa; the geographical concentration of the name both within and between countries; the Mosaic codes in which the names are mostly found; and the appearance of diagnostic letter sequences.

This evaluation also establishes the confidence with which we can say a particular name belongs to a particular Origins type.

Looking at the codes associated with both the personal name and the family name, and taking into account the confidence level of each, Origins identifies the Origins type to which each customer name is most likely to belong.
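
The combination step described above can be illustrated with a minimal sketch. The lookup tables, type labels, confidence values and the rule used here (add up the confidence for each candidate type and take the highest total) are all illustrative assumptions, not the actual Origins reference data or scoring method.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class NameCode(NamedTuple):
    origins_type: str   # hypothetical type label
    confidence: float   # hypothetical scale from 0.0 to 1.0


# Hypothetical reference tables; the real tables hold far more names.
PERSONAL_NAMES = {
    "roger": NameCode("English", 0.60),
    "matti": NameCode("Finnish", 0.90),
}
FAMILY_NAMES = {
    "smith": NameCode("English", 0.85),
    "virtanen": NameCode("Finnish", 0.95),
}


def code_customer(personal: str, family: str) -> Optional[str]:
    """Return the most likely type for a name pair, or None if uncodable."""
    codes = [
        PERSONAL_NAMES.get(personal.strip().lower()),
        FAMILY_NAMES.get(family.strip().lower()),
    ]
    # Assumed rule: sum the confidence of each candidate type.
    totals = defaultdict(float)
    for code in codes:
        if code is not None:
            totals[code.origins_type] += code.confidence
    if not totals:
        return None  # neither name is recognised
    return max(totals, key=totals.get)
```

With these made-up tables, a record whose two names point to different types is assigned the type carried by the more confident name, and a record where neither name is recognised is left uncoded.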

What is Origins' coverage rate?

Provided your files are free of data capture errors, you should be able to code 99.5% of your customer records by Origins type. The remainder are either names which the system does not recognise, because they are rare, or names which the system cannot allocate to any particular Origins type.

What is Origins' level of accuracy?

The level of accuracy varies from one Origins type to another. Origins achieves accuracy rates in excess of 90% in identifying South Asians and Muslims, and 70% in identifying Black Africans, Greeks, Armenians and people from East and South East Europe. It achieves accuracy rates of 50% with Hispanics. Lower accuracy rates are achieved with people of Nordic or French origin, with Jews and Black Caribbeans.

As would be expected, the system is more accurate when coding names to general categories, such as South Asians or Greeks, than to specific sub-categories, such as Sri Lankans or Greek Cypriots.

How does Origins handle persons of mixed ancestry?

Origins can be used to identify persons whose names come from more than one tradition – for example a person with an English personal name and a Finnish family name.

The confidence score given to each name combination can also be used to select or deselect people who are most likely to be of mixed ancestry. Restricting a communication to names with high confidence scores is an effective way of avoiding communicating with individuals who are least likely to belong to the selected target group.
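
As a rough sketch of how such a selection might look in practice, the example below flags records whose personal and family names fall in different types and keeps only high-confidence records for a target group. The record layout, the field names and the 0.8 threshold are assumptions made for illustration; they are not part of the Origins output format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CodedRecord:
    personal_type: str   # hypothetical: type of the personal name
    family_type: str     # hypothetical: type of the family name
    confidence: float    # hypothetical combined score, 0.0 to 1.0


def is_mixed(record: CodedRecord) -> bool:
    """A name combination drawn from more than one naming tradition."""
    return record.personal_type != record.family_type


def select_target(records: List[CodedRecord], target_type: str,
                  min_confidence: float = 0.8) -> List[CodedRecord]:
    """Keep unmixed records of the target type above the threshold."""
    return [
        r for r in records
        if r.family_type == target_type
        and not is_mixed(r)
        and r.confidence >= min_confidence
    ]
```

Raising `min_confidence` shrinks the selection but reduces the chance of contacting people outside the target group; lowering it does the opposite.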

Profiling using Origins

When Origins is used to profile customer, citizen or employee files it is possible to compare the distribution of records by Origins on your file with the distribution of the population by Origins in the geographical region which you serve or from which your employees are drawn.


For example you can specify as your base comparison any administrative region, local authority district, postcode area, police, education or health area in Great Britain. The distribution of the population by Origins is also available for regions of the USA and other European countries.
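
One common way to express such a comparison is an index in which 100 means a group's share of your file matches its share of the base population. The sketch below assumes that convention; the function name, the type labels and the figures in the usage example are hypothetical and are not taken from Origins.

```python
from collections import Counter
from typing import Dict, Iterable


def representation_index(file_types: Iterable[str],
                         base_shares: Dict[str, float]) -> Dict[str, float]:
    """Index each type's share of the file against its base share.

    base_shares maps each type to its share of the base population
    (0.0 to 1.0). An index above 100 suggests over-representation.
    """
    counts = Counter(file_types)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("the file contains no coded records")
    index = {}
    for origins_type, base_share in base_shares.items():
        if base_share <= 0:
            continue  # no meaningful index without a base population
        file_share = counts.get(origins_type, 0) / total
        index[origins_type] = 100.0 * file_share / base_share
    return index
```

For example, if a group makes up 20% of a customer file but only 10% of the chosen base region, its index is about 200, meaning it is roughly twice as common on the file as in the base population.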

Using Origins in different countries

Although Origins is a single application, it has facilities whereby it can be optimised for specific international markets. These international versions code certain names differently in different markets. For example a ‘Roger’, which would be coded as ‘English’ in Britain, would be coded ‘French’ in France. Non-GB versions of Origins also allow the mix of names by Origins type to be compared with the Origins mix for the specific market in which the analysis is undertaken.

The product is particularly attractive to international organisations who need a consistent basis for analysing diversity in each of the national markets in which they operate.

Output can be configured for local languages and needs. For example, the way in which the Origins categories are best grouped in Australia will differ from the way they are best grouped in the Netherlands. The system provides complete flexibility over how categories are grouped.

How is Origins accessed?

Origins types and groups can be appended to customer records using Origins software applications. These applications are licensed to clients by Webber Phillips Ltd or by our partners in North America and in Australia. A PC version is downloadable from the internet and accesses reference files which are updated on a regular basis as names from more countries are introduced to the system.


The licence fee depends upon the version of the application licensed. A standard version of the application is designed to code names appearing on British or Irish customer lists. An enhanced version also appends gender and an estimate of life-stage. Versions of Origins optimised for different overseas national markets can also be licensed.

Origins can also be accessed using a web server; or files can be sent directly to Origins Info for coding.

Some users access Origins at the postcode level, either coding customers by the dominant Origins category in their postcode or using databases showing the mix of Origins types for different levels in the postcode geographic hierarchy. These databases are updated annually.
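
The two postcode-level uses mentioned above can be sketched as follows. This is an illustration only: it assumes input in the form of (postcode, Origins type) pairs, uses made-up type labels, and treats the part of a British postcode before the space as one higher level of the hierarchy. None of this reflects the actual layout of the Origins postcode databases.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, Iterable, Tuple

Record = Tuple[str, str]  # (postcode, origins_type), hypothetical layout


def dominant_type_by_postcode(records: Iterable[Record]) -> Dict[str, str]:
    """Return the most frequent type found in each postcode."""
    counts = defaultdict(Counter)
    for postcode, origins_type in records:
        counts[postcode][origins_type] += 1
    return {pc: c.most_common(1)[0][0] for pc, c in counts.items()}


def outward_code(postcode: str) -> str:
    """Part of a British postcode before the space, e.g. 'SW1A'."""
    return postcode.split()[0]


def mix_by_level(records: Iterable[Record],
                 level: Callable[[str], str] = outward_code
                 ) -> Dict[str, Dict[str, float]]:
    """Share of each type within each area at the chosen level."""
    counts = defaultdict(Counter)
    for postcode, origins_type in records:
        counts[level(postcode)][origins_type] += 1
    return {
        area: {t: n / sum(c.values()) for t, n in c.items()}
        for area, c in counts.items()
    }
```

Passing a different `level` function, for instance one that keeps only the leading letters of the postcode, would give the mix at a coarser level of the hierarchy.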

Need more details? Contact us

We are here to help. Contact us by phone or email.

© 2019 by OriginsInfo
