Healthcare Provider Data Augmentation Tool - Hygiene, Enrich, Cleanse Provider Data



Enrich and cleanse your provider data, and build cross-reference tables for it. Provider Data Augmentation is a lookup tool that checks your provider data against the NPIDS database and returns the matching NPPES data in our easy-to-use data model, ready to merge into your in-house database. For example, if you have any one piece of a physician's information, such as an NPI, a UPIN, an office phone number, an office fax number, or a State License number, the Provider Data Augmentation lookup tool automatically validates and categorizes your data and returns the NPPES data in the NPIDS data structure. You can use the data returned from our database to augment, correct, or verify your own. It is also useful for building any type of cross-reference table, including UPIN or Other ID cross-reference tables.

A sample input file can be downloaded by clicking the link Download Data Augmentation Sample Input Data.

A sample output file can be downloaded by clicking the link Download Data Augmentation Sample Output Data.



To submit your file, first log in to our website. Refer to the "Input File Format" tab for details about the format of your input file, then upload an input file that meets the "Input File Format" criteria. As soon as the file is uploaded, our system runs preliminary data validations and confirms the result immediately. Once the purchase is completed, our automated process produces the provider data files and zips them into one file, normally within 30 minutes. It may take longer depending on the number of jobs in the queue. You will be notified by email as soon as the zip file is ready. The zip file will be available for download in the "Available Download" section of the "My Account" menu. The download link is removed after you have successfully downloaded the zip file. To download your file again, contact us with a support request.






You can submit up to 25,000 records at a time. The one-time processing fee is $150.00.

Product Description | One Time Processing Fee
Data Augmentation - One Time Processing Fee | $150.00



You can submit one CSV file with the data fields described below. Provide us with a list of the providers for which you need the complete NPPES data set. We will provide the extracted file, including NPI and crosswalk data, in the file format that will integrate most easily into your database.


Your file must have the following format for the first two columns.

Column Number | Description
#1 | The provider's valid NPI number, UPIN number, State License number, Phone Number, or Fax Number. An NPI Number must have 10 digits. When a State License number is provided, the second column must contain the 2-character State Code. Phone and Fax Numbers must be 10-digit numbers with no special characters. If there is no data in the second column, the first column is treated as an NPI, Telephone Number, Fax Number, or UPIN.
#2 | Populate this field only if the first field holds a State License number; otherwise, leave it blank. US state codes must have exactly two characters. If this field is blank, the first column is treated as an NPI, Telephone Number, Fax Number, or UPIN.
#3-248 | Optional columns for your in-house IDs and other data. These values are returned to you in report.csv and may help you match the results and merge them back into your database. Please do not include any Protected Health Information (PHI).
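
Before uploading, you may want to pre-check the first two columns against the rules in the table above. The following is a minimal Python sketch under stated assumptions, not the tool's own logic: the function names `normalize_phone` and `precheck_row` and the labels they return are illustrative only, and the real categorization happens after upload. Because a 10-digit value may be an NPI, a phone number, or a fax number, the sketch does not try to tell them apart.

```python
import re


def normalize_phone(raw: str) -> str:
    """Strip spaces and punctuation from a phone or fax number, leaving 10 digits."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)}: {raw!r}")
    return digits


def precheck_row(first: str, second: str = "") -> str:
    """Return a rough local guess at how columns #1 and #2 would be read.

    The returned labels are hypothetical; only the upload gives the real category.
    """
    first, second = first.strip(), second.strip()
    if not first:
        raise ValueError("column #1 is empty")
    if second:
        # Column #2 is populated, so column #1 is read as a State License number.
        if not re.fullmatch(r"[A-Za-z]{2}", second):
            raise ValueError(f"state code must be two letters: {second!r}")
        return "STATE LICENSE"
    if re.fullmatch(r"\d{10}", first):
        # Ten digits: an NPI, a phone number, or a fax number.
        return "NPI OR PHONE/FAX"
    # Assumption: anything else with column #2 blank is meant as a UPIN.
    return "UPIN"
```

For example, `normalize_phone("(555) 123-4567")` yields a value with no special characters that can be placed in column #1, and `precheck_row("MD12345", "NYC")` raises an error because the state code is not two characters long.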
Your input file:

     • Name must end with the suffix CSV or csv.
     • Must have at least two columns.
     • Can contain a maximum of 25,000 records.
     • Can have a maximum size of 3 MB.
     • Can have up to 248 columns per record. Columns #3 to #248 can optionally be used to send your in-house IDs and
        other data. These fields may help you match the results and merge the data back into your database. Please do not
        include any Protected Health Information (PHI).
     • Must have the same number of columns in every record. For example, it cannot have one record with 2 columns and
        another record with 3 or more columns.
     • Can mix record types: one record might have an NPI number, another a Telephone number, another a State License
        Number, and another a UPIN.
     • Must delimit columns with commas.
     • Must enclose each column's data in double quotes (").
     • Will be deleted automatically from our system two weeks after it is processed.
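
The file rules above can be checked locally before you upload. The sketch below is one possible approach using only the Python standard library; the helper names `write_input_file` and `check_input_file`, the sample identifiers, and the reading of "3 MB" as 3 * 1024 * 1024 bytes are assumptions made for illustration. It writes every field in double quotes and then checks the suffix, size, record count, and column counts.

```python
import csv
import os
import tempfile

MAX_RECORDS = 25_000
MAX_BYTES = 3 * 1024 * 1024  # assumption: "3 MB" means 3 * 1024 * 1024 bytes
MAX_COLUMNS = 248


def write_input_file(path, rows):
    """Write rows comma-delimited, with every field enclosed in double quotes."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f, quoting=csv.QUOTE_ALL).writerows(rows)


def check_input_file(path):
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    if not path.endswith(("csv", "CSV")):
        problems.append("file name must end in CSV or csv")
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file is larger than 3 MB")
    with open(path, newline="", encoding="utf-8") as f:
        widths = [len(row) for row in csv.reader(f)]
    if len(widths) > MAX_RECORDS:
        problems.append("more than 25,000 records")
    if len(set(widths)) > 1:
        problems.append("records do not all have the same number of columns")
    if widths and min(widths) < 2:
        problems.append("every record needs at least two columns")
    if widths and max(widths) > MAX_COLUMNS:
        problems.append("more than 248 columns in a record")
    return problems


# Demo with made-up identifiers and hypothetical in-house IDs in column #3.
with tempfile.TemporaryDirectory() as folder:
    good = os.path.join(folder, "providers.csv")
    write_input_file(good, [["1234567890", "", "ID-1"], ["MD12345", "NY", "ID-2"]])
    good_problems = check_input_file(good)
    with open(good, encoding="utf-8") as f:
        first_line = f.readline().rstrip("\r\n")

    bad = os.path.join(folder, "uneven.csv")
    write_input_file(bad, [["1234567890", ""], ["MD12345", "NY", "ID-2"]])
    bad_problems = check_input_file(bad)
```

Note that `check_input_file` does not verify that an existing file is actually quoted, because `csv.reader` accepts unquoted fields as well. Generating the file with `csv.QUOTE_ALL` is the simpler way to satisfy that rule.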


The Provider Data Augmentation tool processes your input file and produces data files in the NPIDS normalized data model. It produces five files (npis.csv, personnames.csv, orgnames.csv, taxonomies.csv and otherids.csv). These are normalized provider data files, linked together with the NPI number as the primary key.



Output Provider Data Structure:

     1. Primary Key and Composite Key columns are highlighted in blue font in the diagram/data layout below.
     2. Each record in the NPI file (npis.csv) has a matching record either in the Person Names file (personnames.csv) or in the
        Organization Names file (orgnames.csv).
     3. One provider record can have up to 15 records in the Taxonomies file (taxonomies.csv). Each record is identified by a sequence
        number from 1 to 15. The combination of NPI number and sequence number is always unique.
     4. One provider record can have up to 50 records in the Other IDs file (otherids.csv). Each record is identified by a sequence
        number from 1 to 50. The combination of NPI number and sequence number is always unique.
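
The key relationships listed above can be sketched in a few lines of Python. This is only an illustration: the header names (`NPI`, `Sequence`, `Code`) and the sample values are hypothetical stand-ins, so check README.txt in your zip file for the actual record layouts. The sketch groups child records by NPI and confirms that each NPI and sequence number combination appears only once.

```python
import csv
import io
from collections import defaultdict

# Made-up stand-ins for npis.csv and taxonomies.csv. The header names
# ("NPI", "Sequence", "Code") and all values are hypothetical.
NPIS_CSV = "NPI\n1111111111\n2222222222\n"
TAXONOMIES_CSV = (
    "NPI,Sequence,Code\n"
    "1111111111,1,CODE-A\n"
    "1111111111,2,CODE-B\n"
    "2222222222,1,CODE-C\n"
)


def group_by_npi(child_csv: str, max_sequence: int) -> dict:
    """Group child rows by NPI while checking the (NPI, Sequence) composite key."""
    seen = set()
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(child_csv)):
        key = (row["NPI"], int(row["Sequence"]))
        if key in seen:
            raise ValueError(f"duplicate composite key: {key}")
        if not 1 <= key[1] <= max_sequence:
            raise ValueError(f"sequence number out of range: {key}")
        seen.add(key)
        grouped[row["NPI"]].append(row)
    return dict(grouped)


taxonomies = group_by_npi(TAXONOMIES_CSV, max_sequence=15)
# Attach each provider's taxonomy rows using NPI as the linking key.
providers = {
    row["NPI"]: taxonomies.get(row["NPI"], [])
    for row in csv.DictReader(io.StringIO(NPIS_CSV))
}
```

The same function could be applied to the Other IDs data with `max_sequence=50`.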

[Diagram: Detailed Provider Data Structure (Model) of Complete Data]

In addition to the five data files above, the following four files are also produced for your use. All nine files are zipped together into one zip file.


     • report.csv contains your input records, each with a status (GOOD, DUPLICATE or ERROR).


     • xrefer.csv gives the cross-reference relationship between your input file and our output files. Its record layout consists
       of the first two columns of your input file, the record category, the record count, and the cross-reference NPI list. These
       fields help you match the results and merge them back into your database. The record category shows how our system
       categorized the input record. It is one of four categories: NPI, PHONEFAX, UPIN, or STATE LICENSE CODE. The record
       count matches the number of NPIs in the NPI list.


     • error_log.csv contains the input records that failed our categorization and validation process. This file will not
       be present if there are no errors. If an error record is duplicated, only one copy is returned.


     • README.txt describes the record layouts of the files you receive from us.
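
After downloading the zip file, you can confirm its contents and summarize the statuses in report.csv. The following is a hedged sketch rather than a definitive implementation. It assumes that the files sit at the top level of the archive, that the status is the last column of report.csv, and that report.csv has no header row; none of these is stated above, so verify them against README.txt. The helper names and the in-memory sample archive are made up for the demonstration.

```python
import csv
import io
import zipfile
from collections import Counter

DATA_FILES = {"npis.csv", "personnames.csv", "orgnames.csv", "taxonomies.csv", "otherids.csv"}
EXTRA_FILES = {"report.csv", "xrefer.csv", "README.txt"}
# error_log.csv is left out on purpose: it is absent when there are no errors.


def missing_members(zf: zipfile.ZipFile) -> list:
    """List the expected files that are not in the archive (top level assumed)."""
    return sorted((DATA_FILES | EXTRA_FILES) - set(zf.namelist()))


def status_counts(zf: zipfile.ZipFile) -> Counter:
    """Tally report.csv statuses; assumes the status is the last column."""
    with zf.open("report.csv") as raw:
        reader = csv.reader(io.TextIOWrapper(raw, encoding="utf-8", newline=""))
        return Counter(row[-1] for row in reader if row)


# Build a small in-memory archive with made-up content for the demonstration.
sample_report = (
    '"1234567890","","GOOD"\n'
    '"1234567890","","DUPLICATE"\n'
    '"XYZ","","ERROR"\n'
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    for name in sorted(DATA_FILES | EXTRA_FILES):
        zf.writestr(name, sample_report if name == "report.csv" else "")

with zipfile.ZipFile(buffer) as zf:
    missing = missing_members(zf)
    counts = status_counts(zf)
```

With a real download, replace `buffer` with the path to your zip file. If report.csv turns out to have a header row, skip it before counting.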






The provider data in NPIDS data sets is taken directly from the monthly updates of the National Plan & Provider Enumeration System (NPPES), published by the Centers for Medicare and Medicaid Services (CMS), a division of the US Department of Health and Human Services (HHS), an agency of the US federal government. NPIDS is not responsible for the quality and timeliness of NPPES data. NPIDS may add a few columns to some of the files to normalize the data and to accommodate derived descriptions, which help customers use the data set in their own systems and environments.