Look Who’s MeriTalking: Fighting Fraud, Waste, and Abuse With Big Data

By: Mary Tobin

Mar 23, 2017 | 9:19 am

MeriTalk recently spoke with Alan Ford, director of Teradata Government Systems. He discussed fraud, waste, and abuse in government: why it happens and how big data and machine learning can help stop this $300 billion-a-year problem.

MeriTalk: Why do we see fraud, waste, and abuse continue to rise–especially related to Medicare and Medicaid–when reducing them has been a sustained priority?

Alan Ford: Medicare and Medicaid, specifically, are classified as high-risk programs by the Federal government because they have a greater vulnerability to fraud, waste, abuse, and mismanagement. There are a few reasons for this. First, penalties in this area have traditionally been low relative to other crimes, and the offenses are nonviolent, so deterrence is weak. Second, the system operates in a pay-first, check-later fashion, making it more susceptible to abuse because barriers to entry are low and the perpetrators are often long gone by the time the fraud is discovered. Third, Medicare and Medicaid are vital programs, so any changes to make fraud detection easier have to be made without interrupting the delivery of services to lawful recipients.

Medicare and Medicaid comprise just one area of fraud, waste, and abuse. There are many more similar use cases including Federal student loans, defense contractors, disaster relief requests, and mortgages. Many of the same issues that put Medicare and Medicaid at risk apply to these other areas as well.

MeriTalk: Where has the greatest progress been made in stopping or reducing government fraud, waste, and abuse? Are there specific programs that you can cite as best practices?

AF: The more data that is made available for analysis, the better the chances that agencies can generate adequate levels of information to drive the detection of fraud, waste, and abuse. Data sharing across Federal and state barriers enables new insight into fraudulent activity, which is difficult to achieve when data are kept siloed. Data sharing is a major opportunity for agencies to become more effective.

As an example, the Health Care Fraud Prevention and Enforcement Action Team (HEAT), a joint task force among Health and Human Services (HHS), the Department of Justice (DoJ), and the Office of Inspector General (OIG), was created to share data and information. Since its inception in 2009, it has detected and collected more than $7 billion of fraudulent monies and convicted almost 2,000 different defendants–very effective work.

Another best practice program has originated from the Centers for Medicare & Medicaid Services’ (CMS) Integrated Data Repository (IDR), one of the largest and most successful fraud and health care information repositories in the Federal government. It is based on a high-volume data warehouse, including information such as Medicare beneficiary data, provider data, contract information, and risk scores. The combination of these data sources into one integrated environment empowers organizations such as HEAT to use the data to generate new insight. Teradata has helped CMS run and operate IDR for more than 10 years with extraordinary results.

MeriTalk: What role can and must big data and analytics play in reducing the incidence of fraud, waste, and abuse–including preventing improper payments and determining accurate eligibility for and enrollment in specific Federal programs?

AF: Big data analytics are important because they create so much valuable insight from available data–structured and unstructured. We need analytical techniques that are sophisticated, but easy enough to use so analysts performing investigations can access and combine these different data types.

For example, the Social Security Administration is using disability claim information and looking at medical taxonomies and expected diagnoses to re-create decision-making processes to assist in identifying fraudulent claims. They could benefit from techniques and systems to transform raw, unstructured claim data into meaningful and useful information.

MeriTalk: Much of the focus has been on looking backward to identify instances that have already occurred. What types of technologies are helping agencies to identify and prevent fraud and abuse before it happens?

AF: Agencies need to act predictively rather than reactively. Many organizations are still focused on determining what happened in the past and why. Predictive tools available today give agencies the ability to look ahead instead.

Agencies need to combine today’s predictive analytic technologies with near-real-time data ingestion to determine what is happening now or is likely to happen in the near future. Tools that allow individuals to provision new data sets without significant IT intervention, and to combine unstructured and traditional data, are required to make agencies predictive, agile, and proactive. Again, the technology exists now.

MeriTalk: How is machine learning factoring into fraud, waste, and abuse identification and prevention initiatives? What’s next for machine learning in this area?

AF: Wherever the capabilities exist to integrate sensor data and Internet of Things (IoT) data for analytical work, there is an opportunity to leverage machine learning as well. Machine learning is great at sifting through enormous data sets and looking for outliers and insight we cannot get elsewhere as quickly or effectively.

For example, we need to apply machine learning to activities like modern aircraft maintenance. A fighter jet may have thousands of sensors collecting data at microsecond intervals, resulting in a terabyte or two of data from just one flight. Machine learning algorithms can sift through huge data sets collected across an entire fleet and flag relevant outliers for a human to investigate. Applying advanced analytics to this data can identify operational trends and circumstances that predict part and equipment failure before it happens. Engaging in this “condition-based maintenance” can prevent inefficient and wasteful use of repair and inventory resources and head off catastrophic failures before they occur.
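To make the idea of flagging fleet-wide outliers for human review concrete, here is a minimal sketch in Python. It is not the method used by any agency or by Teradata. It swaps in a simple robust statistic (a modified z-score based on the median absolute deviation) in place of a trained machine learning model, and the aircraft IDs, readings, and threshold are all hypothetical.

```python
from statistics import median


def flag_outliers(readings, threshold=3.5):
    """Return IDs whose reading has a modified z-score above `threshold`.

    `readings` maps an aircraft ID to one summary sensor value.
    The 3.5 cutoff is a common rule of thumb, assumed here for illustration.
    """
    values = list(readings.values())
    med = median(values)
    # Median absolute deviation: a spread measure that one extreme value cannot distort much.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # All readings are (nearly) identical, so nothing stands out.
    return sorted(
        tail for tail, v in readings.items()
        if 0.6745 * abs(v - med) / mad > threshold
    )


# Hypothetical peak-vibration readings (arbitrary units), one per aircraft.
fleet = {
    "AC-01": 1.02, "AC-02": 0.98, "AC-03": 1.05,
    "AC-04": 0.97, "AC-05": 1.01, "AC-06": 2.40,
}
flagged = flag_outliers(fleet)
print(flagged)  # aircraft a maintainer might inspect first
```

In this toy fleet only AC-06 is flagged. A production system would work on far larger, multi-sensor data and would likely use learned models rather than a single statistic, but the workflow is the same: the algorithm narrows the field and a person investigates.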

MeriTalk: What hurdles remain in achieving real-time identification of suspect transactions or behaviors? How can agencies best address them, especially in this time of doing more with less?

AF: There are three big hurdles. First is reducing the lag between an event and detection of the event. To remove the hurdle, analysts need better, quicker access to important data.

Second is opening the world of advanced analytics to the people who are doing the investigations–fraud analysts. They need access to advanced analytical tools with pre-existing complex algorithms that enable them to plug parameters into them, rather than having to know how to write the algorithms themselves. These tools exist, but agencies have to identify them and get them into the hands of their analysts.

Third is making it easier to reach across various data platforms and combine data sets. We need solutions that let users access and combine data regardless of where the data reside and which platforms are involved. Again, these tools exist, but agencies need to procure and employ them.

MeriTalk: How are Teradata and its solutions helping Federal agencies attack the fraud, waste, and abuse challenge?

AF: Teradata has been helping government agency customers for more than 20 years with traditional techniques like data warehousing to combine data across multiple subject areas into a single integrated data model, enabling greater insight. We have health care solutions across the Federal and state government, such as CMS and its IDR. Several states use data warehousing tax solutions to identify tax fraud, mistakes in filings, and more. And finally, we support an enterprise data warehouse at the Postal Service to help the organization run its operations more efficiently.

We also have leading-edge solutions involving advanced analytics of nontraditional data. For example, USTRANSCOM has a data warehouse that helps with materiel logistics, but also employs advanced analytics for optimizing cargo transport across available transport vehicles in the military. USTRANSCOM uses its system to marry cargo needing transport with partially loaded flights or even empty training missions to reduce the overall number of flights and optimize the efficiency of planned trips.
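The matching of cargo to partially loaded flights can be illustrated with a small sketch. This is not USTRANSCOM's system or a Teradata product. It is a simple greedy heuristic (best-fit decreasing), assumed here only to show the shape of the problem, and the shipment names, flight names, and capacities are hypothetical.

```python
def assign_cargo(shipments, spare_capacity):
    """Greedily place shipments on flights that already have spare capacity.

    `shipments` maps a shipment name to its weight.
    `spare_capacity` maps a flight name to its unused capacity (same units).
    Returns (plan, unassigned): plan maps shipment -> flight.
    """
    remaining = dict(spare_capacity)  # copy so the caller's data is untouched
    plan = {}
    unassigned = []
    # Heaviest first: large items are the hardest to place later.
    for name, weight in sorted(shipments.items(), key=lambda kv: kv[1], reverse=True):
        candidates = [f for f, cap in remaining.items() if cap >= weight]
        if not candidates:
            unassigned.append(name)  # would need a new, dedicated flight
            continue
        # Best fit: choose the flight that is left with the least slack.
        best = min(candidates, key=lambda f: remaining[f])
        plan[name] = best
        remaining[best] -= weight
    return plan, unassigned


# Hypothetical data: weights and capacities in the same arbitrary units.
shipments = {"pallet-A": 4, "pallet-B": 3, "pallet-C": 6}
flights = {"training-1": 10, "partial-2": 4}
plan, unassigned = assign_cargo(shipments, flights)
print(plan, unassigned)
```

Here all three pallets fit on flights that were already scheduled, so no additional flight is needed. Real transport planning adds constraints such as routes, timing, volume, and cargo type, which a heuristic this simple does not model.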

The Air Force and Navy use advanced Teradata analytics for pre-emptive maintenance, identifying precursor conditions of equipment failures. For example, if sensor data identifies excessive vibration in a turbine, it could indicate an imminent bearing failure. Proactively addressing the source of the vibration or replacing affected parts helps prevent a broader failure later.

Today and in the near future, Teradata increasingly plays a major IT and analytics consulting role, leveraging valuable intellectual property and innovative solutions from thousands of customer engagements. The technology landscape has become so complex that few organizations can operate at peak effectiveness without expert consulting guidance.
