Research Defender

Jibunu has integrated with Research Defender to provide additional data quality services. Research Defender is a respondent fingerprinting and fraud detection service that can be added to a survey. More information can be found at: https://researchdefender.com/.

Research Defender uses multiple techniques to identify and prevent fraudsters. These techniques range from tracking common fraudulent activities (e.g., TOR usage, location spoofing), to behavioral tracking of professional survey takers. In addition, they use a variety of tools to help identify and root out trends.
Research Defender’s Search tool is intended to be used as a loading screen by default. However, it can also be used as an introduction screen where the respondent sees the introduction survey text. The process is invisible to the user and does not interfere with the user experience. Research Defender is compatible with privacy and data protection laws.

Research Defender is not enabled on any survey unless requested. To have Research Defender enabled on a study, notify the Project Manager. Please note that there are additional costs associated with adding Research Defender to a survey. Contact the Project Manager for more details.

By default, all terminates based on Research Defender are disabled. To terminate respondents based on any of the Research Defender criteria, please notify the Project Manager.

Search tool options include:

  • Store the data with no other action taken (default)
  • Terminate the respondent based on the result:
    • threat_potential_score: The recommended logic for terminating based on the threat potential score would be anyone with a 29 or higher.
    • duplicate_score: The recommended logic for terminating based on the duplicate score would be anyone with 100.
    • If terminating based on the Research Defender response, Jibunu will also terminate the respondent if the API call is not able to complete for any reason.
  • Additional custom terminate criteria can be added such as using a combination of factors to terminate a respondent.
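
The recommended terminate logic above can be sketched in a few lines of Python. This is only an illustration of the rules as described, not Jibunu's actual implementation: the function name `should_terminate`, the dictionary shape of the result, and the use of `None` to stand for an API call that could not complete are all assumptions made for the example.

```python
from typing import Optional

# Recommended cutoffs taken from the options listed above.
THREAT_SCORE_CUTOFF = 29      # terminate at 29 or higher
DUPLICATE_SCORE_CUTOFF = 100  # terminate at 100


def should_terminate(result: Optional[dict]) -> bool:
    """Return True if the respondent would be terminated.

    `result` is a hypothetical dict holding the Search tool fields.
    None stands in for an API call that could not complete.
    """
    # When terminating on the Research Defender response, a failed
    # API call also terminates the respondent.
    if result is None:
        return True
    # Assumption: a missing field is treated as a score of 0.
    if result.get("threat_potential_score", 0) >= THREAT_SCORE_CUTOFF:
        return True
    if result.get("duplicate_score", 0) >= DUPLICATE_SCORE_CUTOFF:
        return True
    return False
```

A custom rule that combines several factors would follow the same pattern, with the extra conditions agreed with the Project Manager.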

In addition to these data flags, Research Defender has a Review tool that can be added to analyze open-end responses. If you would like this tool included in your survey, we need a programming note stating that “/Review” should be added to the specified open-end questions.

Research Defender Data Fields and Values
Data Flag: Description

Search Tool
threat_potential: Text description of respondent’s threat level (e.g. known fraudster from existing database). External data providers (EDPs) are utilized to help identify this. This value defaults to “Low”. If only one of the EDPs identifies a respondent as risky, they are flagged as “Medium”. If more than one of the EDPs identifies them as high risk, they are flagged as “High”.
threat_potential_score: Numeric value of respondent’s threat level (e.g. known fraudster from existing database). External data providers (EDPs) are utilized to help identify this. Range of values as follows: Low (1 to 33), Medium (34 to 67) and High (68 to 100).
respondent_risk: Identifies fraudulent / bad actors (TOR Network, Private VPNs, etc.). External data providers (EDPs) are utilized to help identify this. This value defaults to “0”. If more than one of the EDPs identifies them as high risk, we flag them as “1”.
country_code: Text code of country in ISO format (e.g. US, PH, IN).
country: Text description of country of respondent (e.g. United States, India).
respondent_ud: This identifier helps our clients manage the identity of the respondent. Specifically, it allows clients to implement rules that are more granular – for example, using this identifier, clients can allow re-entry into specific surveys because of some client specific business rule.
duplicate_potential: Currently, this value is “low” or “high”. Over time, we will use incoming data to further track behavioral metrics to help clients identify a “medium” which will produce a more nuanced understanding of a suspicious duplicate survey attempt.
duplicate_score: This is the float/numeric equivalent of the “duplicate_potential” field. Currently, this value is either “0” or “100”. Over time, we will use incoming data to further track behavioral metrics to help clients identify a range which will produce a more nuanced understanding of a suspicious duplicate survey attempt.
duplicate_initial_ud: In the case of an existing duplicate, this field would return the original sn_ud to help our clients understand which value was originally a duplicate.
flag: Identifies previous access to the survey; 1 if true and 0 if false.
failure_reason: Returns a code based on the respondent’s fraudulent status (Duplicate entrant, TOR users, etc.). If threat_potential_score >= 30, this parameter will populate with a failure reason as seen below.

The failure_reason codes and their descriptions are as follows:
“02” – Duplicate entrant into survey:
The respondent has already attempted this survey, detected via their IP Address or Digital Fingerprint.

“03” – User Emulator:
Emulator software enables fraudsters to emulate/recreate browsers, machines and operating systems as well as spoof other devices. Emulators also allow a respondent to create unique digital fingerprints on a single device.

“04” – VPN usage detected:
Virtual Private Network. There are many valid reasons to use a VPN, so /SEARCH won’t flag respondents only for being on a VPN. However, when a respondent is on a VPN or Public Proxy and is also flagged as an Internet Fraudster, that combination has an extremely high correlation to fraudulent activity and will result in the respondent being flagged.

“05” – TOR network detected:
TOR is software used to enable anonymous communication in order to conceal a user’s location. Like Proxies and VPNs, TOR use alone does not indicate fraudulent activity, but combined with other flags it can be used to determine the likelihood of fraudulent behavior. This status indicates that this respondent is either currently using a TOR Network or has been linked to a TOR Network.

“06” – Public proxy server detected:
A proxy acts as an intermediary between a respondent and a remote connection on the internet. The Public Proxy status indicates the respondent is accessing a survey using a publicly available proxy server. Public Proxies can be accessed by multiple respondents at the same time.
 

“07” – Web proxy service used:
Web Proxies are a subset of Public Proxies, but can be accessed via the internet with no software or security.

“08” – Web crawler usage detected:
This indicates that a respondent has been linked to web crawling activity in the past, which has a high correlation to fraudulent activity, specifically in the ad-tech space.

“09” – Internet fraudster detected:
These are respondents that have been linked to general fraud/abuse on the internet. They are flagged by one or more of the third party vendors we work with who compile databases of fraudsters from various sites on the internet. These fraud prevention vendors compile databases using elements of digital fingerprints tied to historically fraudulent respondents.

“10” – Retail and ad-tech fraudster detected:
Same explanation as above (09), except these are respondents that have been flagged for fraud or abuse specifically in the Retail or Ad-Tech industry.

“11” – IP Address subnet detected:
A subnet is a recreation of the same IP address, used to spoof technologies that are looking for a direct match. This can be done easily in order to make an IP Address seem unique.

“12” – Recent Abuse detected: 
The respondent has been recently flagged for fraud/abuse by one of our fraud prevention providers. This is similar to “09” above.
 
“13” – Duplicate Survey Group detected:
This is a deduplication at the Survey Group level (if Survey Group functionality is active in the account).

“14” – Navigator Webdriver detected:
/SEARCH has detected one or more automation tools (Selenium, Puppeteer, Playwright) in the respondent’s browser or machine. These tools are synonymous with fraudulent activity in Market Research.

“15” – Developer Tool detected:
/SEARCH has detected a Developer Tool window that is open in the respondent’s browser. This is often used by fraudsters to manipulate API response data and/or their digital fingerprint.

“16” – Web RTC IP Address:
WebRTC is an open-source project that allows Research Defender to identify respondents attempting to ‘hide’ their IP Addresses. This enables /SEARCH to uncover and store their actual IP addresses. This failure reason will always be returned when the Threat Potential score is 31, for clients who have a score of 31 considered to be threat level = medium.

“17” – Proxy Detected:
A nefarious T-Mobile Proxy has been detected on the respondent’s session.

“18” – Maxmind Failure Detected:
Respondent was flagged by Maxmind’s minFraud product.
country_mismatch: Identifies country mismatch between respondent and stated country. 1 if true and 0 if false.
sn_ud: Respondent’s Unique ID
survey_number: Jibunu survey number
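
When reviewing exported Search tool data, it can help to translate the numeric fields back into the labels described above. The following Python sketch is an illustration only: the helper names `threat_band` and `describe_failure` are hypothetical, the score bands are the ranges documented for threat_potential_score, and the lookup table simply repeats the failure_reason titles listed in this article. Individual accounts may be configured with different bands.

```python
# failure_reason codes and titles, copied from the list in this article.
FAILURE_REASONS = {
    "02": "Duplicate entrant into survey",
    "03": "User Emulator",
    "04": "VPN usage detected",
    "05": "TOR network detected",
    "06": "Public proxy server detected",
    "07": "Web proxy service used",
    "08": "Web crawler usage detected",
    "09": "Internet fraudster detected",
    "10": "Retail and ad-tech fraudster detected",
    "11": "IP Address subnet detected",
    "12": "Recent Abuse detected",
    "13": "Duplicate Survey Group detected",
    "14": "Navigator Webdriver detected",
    "15": "Developer Tool detected",
    "16": "Web RTC IP Address",
    "17": "Proxy Detected",
    "18": "Maxmind Failure Detected",
}


def threat_band(score: float) -> str:
    """Map threat_potential_score to the documented Low/Medium/High band."""
    if 1 <= score <= 33:
        return "Low"
    if 33 < score <= 67:
        return "Medium"
    if 67 < score <= 100:
        return "High"
    # Values outside the documented 1 to 100 range are rejected.
    raise ValueError(f"score outside documented range: {score}")


def describe_failure(code) -> str:
    """Return the title for a failure_reason code such as "05" or 5."""
    # Assumption: codes may arrive as numbers, so pad to two digits.
    key = str(code).zfill(2)
    return FAILURE_REASONS.get(key, "Unknown code")
```

For example, `threat_band(45)` gives "Medium" and `describe_failure("14")` gives "Navigator Webdriver detected".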

Review Tool (for Open-Ends) 
language_detected: Text description of the language used in the response (e.g. English, French).
language_detected_score: Confidence interval with which the language is correctly detected.
garbage_words and garbage_words_score: No longer used. Language = Unknown is a better barometer for highlighting responses containing “garbage words”.
profanity_check: Detects whether or not profanity is used in the response; 1 if true and 0 if false.
profanity_check_score: If the response contains profanity, 1. Otherwise 0. We don’t recommend using this parameter.
pasted_response: Detects whether the text response was pasted from the user’s clipboard.
pasted_response_score: Value represents the proportion of the answer that was taken from the clipboard. For example, if an answer has 10 characters and 4 of those characters were pasted from the respondent’s clipboard, the pasted_response_score will be 0.4.
engagement_score: Measures the length of a response compared to other responses within the same Question ID. If a response is the same length as the average within the Question ID, the engagement_score will be equal to 1. If a response is shorter than the average length within the Question ID, the engagement_score will be less than 1. If a response is longer than the average length within the Question ID, the engagement_score will be greater than 1.
composite_score: This parameter takes into account all the other parameters in order to produce an overall score for the response, including Copy/Paste detection, Profanity detection, Engagement Score, Language Detection, etc.
page_view_time: Time it takes from page load to submitting the text (in milliseconds). Note – this is a stand-alone parameter and does not factor into the composite score.
typed_response_time: Time it takes from the start of entering the text (answer) to submitting the text (in milliseconds). Note – this is a stand-alone parameter and does not factor into the composite score.
similarity_text: /REVIEW will use the s_text_length parameter on the API Call, which dictates the minimum number of characters needed in the response to perform the check. With an s_text_length=5, the response “cow” would not be checked by Similarity, but “Gorilla” would because it’s more than 5 characters. This is helpful to sniff out bots or scripts that provide identical responses to Open-Ended questions. For example, if the response “going on a cruise just before the pandemic” matched another response to the same question, it would return “similarity_text”: 1, indicating a duplicate open end.
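
The arithmetic behind a few of the Review tool fields can be illustrated with a short Python sketch. This is not Research Defender's implementation, which is not published in this article. The function names are hypothetical, the engagement ratio (response length divided by average length) is an assumption that matches the behavior described above, and treating s_text_length as an inclusive minimum is also an assumption.

```python
from statistics import mean


def pasted_response_score(pasted_chars: int, total_chars: int) -> float:
    """Proportion of the answer that came from the clipboard."""
    if total_chars == 0:
        return 0.0
    return pasted_chars / total_chars


def engagement_score(response: str, other_responses: list) -> float:
    """Length of a response relative to the average length of the
    other responses to the same question (assumed formula)."""
    if not other_responses:
        raise ValueError("need at least one other response to compare")
    average_length = mean(len(text) for text in other_responses)
    if average_length == 0:
        raise ValueError("other responses are all empty")
    return len(response) / average_length


def eligible_for_similarity(response: str, s_text_length: int) -> bool:
    """True if the response is long enough for the similarity check.

    Assumption: s_text_length is an inclusive minimum.
    """
    return len(response) >= s_text_length
```

With these definitions, an answer of 10 characters with 4 pasted gives `pasted_response_score(4, 10) == 0.4`, and with `s_text_length=5` the response "cow" is skipped while "Gorilla" is checked.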

Updated 6/18/24

Amanda Albert has written 10 articles
