4.4.2 Beyond sampling

  • Sampling was a product of a time when obtaining and processing data was expensive in terms of both time and money
  • Modern computer technology means that the cost of acquiring, storing and processing data has fallen dramatically
  • Rather than sampling a small subset of the data, it is now possible and affordable to acquire data on the off-chance that it will be useful, and then either examine extremely large samples or process all of the data
  • Big data analyses huge pools of data to find new connections that would have been missed by randomly sampling a tiny fraction of all data
  • Fraud accounts for a tiny fraction of all transactions, so a small random sample drawn from the many millions of transactions made every day is highly unlikely to include even a single case of fraud
  • By examining all transactions for unusual patterns of activity, potentially fraudulent transactions can be identified for further investigation
  • Fraud detection

    • The financial company Xoom specialises in remittances: money sent by immigrant workers to their home countries
    • This is a business which has traditionally been vulnerable to fraud and money laundering
    • Xoom's automated systems scrutinise every one of the company's transactions
    • In 2011, they identified a string of payments originating in New Jersey made on Discover credit cards
    • None of the transactions were themselves unusual, but the pattern created by the times of the transactions, the sums transferred and their recipients was sufficient to arouse suspicion
    • Xoom's fraud detection team were alerted and found the transactions involved criminal activity; the payments were blocked, no customers lost money and the criminals were identified
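
The contrast between sampling and examining every transaction can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the fraud rate, the sample size, the field names (card_network, origin, timestamp) and the "burst" rule are hypothetical and are not a description of how Xoom's systems work. The first function estimates how likely a random sample is to contain any fraud at all; the second scans every transaction and flags groups whose individually ordinary payments form an unusual cluster in time.

```python
from collections import defaultdict


def chance_sample_has_fraud(fraud_rate: float, sample_size: int) -> float:
    """Probability that a random sample contains at least one fraudulent
    transaction, treating each draw as independent (a reasonable
    approximation when the pool is far larger than the sample)."""
    return 1.0 - (1.0 - fraud_rate) ** sample_size


def flag_bursts(transactions, min_count=5, window_seconds=3600):
    """Scan ALL transactions and return the (card_network, origin) groups
    that made at least min_count payments within window_seconds.

    The grouping key and thresholds are hypothetical; a real system would
    also weigh the sums transferred and the recipients."""
    by_group = defaultdict(list)
    for tx in transactions:
        by_group[(tx["card_network"], tx["origin"])].append(tx["timestamp"])

    flagged = set()
    for group, times in by_group.items():
        times.sort()
        start = 0
        # Sliding window over the sorted timestamps of this group.
        for end in range(len(times)):
            while times[end] - times[start] > window_seconds:
                start += 1
            if end - start + 1 >= min_count:
                flagged.add(group)
                break
    return flagged


if __name__ == "__main__":
    # Assumed figures: 1 fraudulent transaction per 100,000, sample of 1,000.
    p = chance_sample_has_fraud(fraud_rate=1e-5, sample_size=1_000)
    print(f"Chance the sample contains any fraud: {p:.2%}")

    # Made-up data: routine payments spread out in time, plus a tight
    # cluster of payments from one card network and origin.
    routine = [
        {"card_network": "Visa", "origin": "CA", "timestamp": i * 10_000}
        for i in range(20)
    ]
    cluster = [
        {"card_network": "Discover", "origin": "NJ", "timestamp": 50_000 + i * 300}
        for i in range(6)
    ]
    print("Flagged for investigation:", flag_bursts(routine + cluster))
```

Under these assumed figures the sample has roughly a 1% chance of containing any fraud, whereas the full scan sees every payment and can flag the cluster for further investigation even though no single payment in it looks unusual.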