6.4. Data management

Subdecks (1)

Cards (42)

  • What are the six Vs?
    Volume
    Veracity
    Value
    Variability
    Velocity
    Variety
  • Volume is the amount of data
  • Value is the meaning of the data after analysis and how well it meets the predefined goals
  • Variety is the range of sources and formats the data comes from
  • Velocity is the speed at which the data is generated in real time
  • Variability is how consistent the data is while being analysed; data that is constantly changing has high variability
  • Veracity is how truthful the data is and therefore how much confidence a user can have in it
  • Whenever data is gathered, processed, and analysed, it must be verified, as otherwise the analysis cannot be trusted
  • An organisation must select the research population it wishes to use to gather the required data and info
  • The group of people required to complete the research is called the research population
  • If a search is not complete, then the data is not assured, as it may be unreliable
  • A research population is chosen based on demographics; if these demographics are unequal, then the results are unreliable
  • Data that is gathered must comply with current legislation
  • Data warehousing is a centralised data storage solution of structured data
  • All data in a data warehouse is integrated into a pre-defined format
  • Data mining is used to find patterns within large amounts of data
  • All data in a data warehouse is ready to be analysed
  • Data in a warehouse cannot be changed or edited
  • Data can be added to a warehouse by updating its sources but cannot be added to it directly
  • A data warehouse requires data to be assured
  • Data lakes do not have fixed sources or formats; instead, they take data from any source in any data type
  • Data in a data lake is only wrangled and assured before being used and analysed
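The difference between the warehouse and lake cards above can be sketched in code. This is only a minimal illustration under stated assumptions: the names `WAREHOUSE_FORMAT`, `Warehouse`, `Lake`, `integrate`, `store`, and `read_assured` are hypothetical, and a real warehouse is fed from its sources rather than by direct calls. The point shown is when the check happens: before storage for a warehouse, at the time of use for a lake.

```python
from typing import Any

# Hypothetical pre-defined warehouse format: field name -> required type.
WAREHOUSE_FORMAT = {"customer_id": int, "amount": float}


def conforms(record: Any) -> bool:
    """Return True if record is a dict with exactly the expected fields and types."""
    return (
        isinstance(record, dict)
        and record.keys() == WAREHOUSE_FORMAT.keys()
        and all(isinstance(record[name], kind) for name, kind in WAREHOUSE_FORMAT.items())
    )


class Warehouse:
    """Structured store: data is checked against the format before it is stored."""

    def __init__(self) -> None:
        self._rows: list[dict[str, Any]] = []

    def integrate(self, record: Any) -> None:
        # Assurance happens up front, so everything stored is ready to analyse.
        if not conforms(record):
            raise ValueError("record does not match the warehouse format")
        self._rows.append(dict(record))

    def rows(self) -> list[dict[str, Any]]:
        # Return copies so callers cannot edit the stored rows.
        return [dict(row) for row in self._rows]


class Lake:
    """Unstructured store: anything is accepted, checks happen at read time."""

    def __init__(self) -> None:
        self._items: list[Any] = []

    def store(self, item: Any) -> None:
        # No format check here: any data type from any source is kept as-is.
        self._items.append(item)

    def read_assured(self) -> list[dict[str, Any]]:
        # Wrangling and assurance are deferred until the data is used.
        return [dict(item) for item in self._items if conforms(item)]
```

For example, storing the string `"free text"` in a `Lake` succeeds but it is filtered out by `read_assured()`, whereas passing it to `Warehouse.integrate` raises a `ValueError` straight away.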
  • Data in a warehouse or lake can be referred to as big data
  • Data mining is used by organisations to process and analyse raw data
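As a rough illustration of what "finding patterns" can mean, the sketch below counts which pairs of items appear together most often across a set of shopping baskets. The function name `frequent_pairs`, the `min_count` threshold, and the basket data are all hypothetical; real data mining tools use far more sophisticated techniques than this simple co-occurrence count.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets: list[list[str]], min_count: int) -> dict[tuple[str, str], int]:
    """Count how often each pair of items occurs together, keeping frequent pairs."""
    counts: Counter = Counter()
    for basket in baskets:
        # Sort and de-duplicate so ("bread", "milk") and ("milk", "bread") count as one pair.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Hypothetical raw data: each inner list is one customer's basket.
baskets = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["eggs", "milk"],
]
patterns = frequent_pairs(baskets, min_count=2)
```

Here `patterns` contains the pairs bought together at least twice, which is the kind of pattern an organisation might then investigate further.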
  • Data reporting takes data and provides info
  • Data reporting is generally used to understand current or past situations (descriptive data)
  • Metadata takes 3 forms:
    Administrative metadata
    Descriptive metadata
    Structural metadata
  • Administrative metadata provides administrative instructions about a file, such as access rights
  • Descriptive metadata helps with accessibility, compatibility, and search functions. It stores data such as title, video runtime, and resolution
  • Structural metadata stores data about how the main data should be arranged and structured; the main example of this is a data dictionary
  • If data is accessed without authorisation, it can be stolen or edited; in either case, the data may be:
    Biased
    Inaccurate
    Unreliable
  • Data lakes are useful for quickly changing data
  • Creation date is an example of administrative metadata
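The three forms of metadata from the cards above can be sketched as a small data structure. This is an illustrative sketch only: the class names, field names, and example values are hypothetical, chosen to mirror the examples in the cards (access rights and creation date, title and runtime and resolution, and a data dictionary).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AdministrativeMetadata:
    access_rights: str   # e.g. who may read or edit the file
    creation_date: str   # ISO date string, e.g. "2024-01-15"


@dataclass(frozen=True)
class DescriptiveMetadata:
    title: str
    runtime_seconds: int
    resolution: str      # e.g. "1920x1080"


@dataclass(frozen=True)
class StructuralMetadata:
    # A data dictionary: field name -> description of how that field is structured.
    data_dictionary: dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class FileMetadata:
    administrative: AdministrativeMetadata
    descriptive: DescriptiveMetadata
    structural: StructuralMetadata


def matches_search(meta: FileMetadata, query: str) -> bool:
    """Descriptive metadata supports search: match the query against the title."""
    return query.lower() in meta.descriptive.title.lower()


# Hypothetical example file.
video = FileMetadata(
    administrative=AdministrativeMetadata(access_rights="read-only", creation_date="2024-01-15"),
    descriptive=DescriptiveMetadata(title="Data Lakes Explained", runtime_seconds=300, resolution="1920x1080"),
    structural=StructuralMetadata(data_dictionary={"frame": "ordered sequence of images"}),
)
```

Calling `matches_search(video, "lakes")` shows how a search function relies on descriptive metadata only, without reading the main data itself.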