6.4. Data management

    • The six Vs are:
      Volume
      Veracity
      Value
      Variability
      Velocity
      Variety
    • Volume is the amount of data
    • Value is the meaning of the data after analysis and how well it meets the predefined goals
    • Variety is the range of sources and formats the data comes from
    • Velocity is the speed at which the data is generated in real time
    • Variability is how consistent the data is while being analysed; data that is constantly changing has high variability
    • Veracity is how truthful the data is, and therefore how much confidence a user can have in it
    • Whenever data is gathered, processed, and analysed, it must be verified, as otherwise the analysis cannot be trusted
    • An organisation must select the research population it wishes to use to gather the required data and info
    • The number of people required to complete research is called the research population
    • If a search is not complete then the data is not assured, as it may be unreliable
    • A research population is chosen based on demographics; if these demographics are unequal then the results are unreliable
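As a rough illustration of the demographics card above, the sketch below compares the make-up of a sample against target shares. The field name `age_band`, the target shares, and the 10% tolerance are all hypothetical values chosen for the example, not figures from the course.

```python
from collections import Counter


def demographic_shares(sample, key):
    """Return each group's share of the sample for one demographic field."""
    counts = Counter(person[key] for person in sample)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def is_balanced(shares, expected, tolerance=0.10):
    """True when every expected group's share is within tolerance of its target."""
    return all(
        abs(shares.get(group, 0.0) - target) <= tolerance
        for group, target in expected.items()
    )


# Hypothetical research population: heavily skewed towards one age band.
sample = (
    [{"age_band": "18-34"}] * 6
    + [{"age_band": "35-54"}] * 3
    + [{"age_band": "55+"}] * 1
)

# Hypothetical target shares for the wider population being studied.
expected = {"18-34": 0.3, "35-54": 0.4, "55+": 0.3}

shares = demographic_shares(sample, "age_band")
balanced = is_balanced(shares, expected)
```

Here `balanced` comes out false because the 18-34 group is over-represented, which is the situation the card describes as giving unreliable results.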
    • Data that is gathered must comply with current legislation
    • Data warehousing is a centralised data storage solution of structured data
    • All data in a data warehouse is integrated into a pre-defined format
    • Data mining is used to find patterns within large amounts of data
    • All data in a data warehouse is ready to be analysed
    • Data in a warehouse cannot be changed or edited
    • Data can be added to a warehouse by updating its sources, but cannot be added directly
    • A data warehouse requires data to be assured
    • Data lakes are not limited to fixed sources and formats; they take data from any source, in any data type
    • Data in a data lake is wrangled and assured only when it is about to be used and analysed
    • Data in a warehouse or lake can be referred to as big data
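The warehouse and lake cards above can be contrasted in a small sketch. It assumes a made-up three-field format (`WAREHOUSE_SCHEMA`) and hypothetical helper names. `load_into_warehouse` stands in for the integration step that runs when a source is updated. Real warehouse and lake products work at a very different scale, so treat this only as a model of the idea: the warehouse checks the format on the way in and stores read-only records, while the lake accepts anything and defers wrangling until the data is used.

```python
from types import MappingProxyType

# Hypothetical pre-defined format for the warehouse.
WAREHOUSE_SCHEMA = {"customer_id": int, "amount": float, "date": str}


def load_into_warehouse(warehouse, record):
    """Accept a record only if it already matches the pre-defined format."""
    if set(record) != set(WAREHOUSE_SCHEMA):
        raise ValueError("record fields do not match the warehouse format")
    for field, expected_type in WAREHOUSE_SCHEMA.items():
        if not isinstance(record[field], expected_type):
            raise ValueError(f"{field} has the wrong type")
    # Store a read-only view so the record cannot be edited afterwards.
    warehouse.append(MappingProxyType(dict(record)))


def load_into_lake(lake, raw):
    """A lake takes data from any source, in any type, without checks."""
    lake.append(raw)


def wrangle_from_lake(lake):
    """Wrangle at the point of use: keep only items that fit the format."""
    usable = []
    for item in lake:
        if isinstance(item, dict) and set(WAREHOUSE_SCHEMA) <= set(item):
            try:
                usable.append(
                    {f: convert(item[f]) for f, convert in WAREHOUSE_SCHEMA.items()}
                )
            except (TypeError, ValueError):
                continue  # cannot be converted, so it is left out of the analysis
    return usable


warehouse = []
load_into_warehouse(
    warehouse, {"customer_id": 7, "amount": 19.99, "date": "2024-01-05"}
)

lake = []
load_into_lake(lake, {"customer_id": "7", "amount": "19.99",
                      "date": "2024-01-05", "note": "web form"})
load_into_lake(lake, "free text log line")
load_into_lake(lake, b"\x00\x01 binary sensor dump")

ready_for_analysis = wrangle_from_lake(lake)
```

All three items stay in the lake untouched; only the first one survives wrangling and ends up in `ready_for_analysis`.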
    • Data mining is used by organisations to process and analyse raw data
    • Data reporting takes data and provides info
    • Data reporting is generally used to understand the current or a past situation (descriptive data)
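A minimal sketch of descriptive reporting, assuming some invented monthly sales figures: the report only summarises what has already happened and makes no prediction. The `sales` data and the `descriptive_report` helper are hypothetical.

```python
from statistics import mean

# Hypothetical past data; the figures are invented for the example.
sales = [
    {"month": "Jan", "revenue": 1200.0},
    {"month": "Feb", "revenue": 1500.0},
    {"month": "Mar", "revenue": 900.0},
]


def descriptive_report(rows):
    """Turn raw rows into information about the past situation."""
    revenues = [row["revenue"] for row in rows]
    best = max(rows, key=lambda row: row["revenue"])
    return {
        "total": sum(revenues),
        "average": mean(revenues),
        "best_month": best["month"],
    }


report = descriptive_report(sales)
```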
    • Metadata takes 3 forms:
      Administrative metadata
      Descriptive metadata
      Structural metadata
    • Administrative metadata provides administrative instructions about a file, such as access rights
    • Descriptive metadata helps with accessibility, compatibility, and search functions. It stores data such as title, video runtime, and resolution
    • Structural metadata stores data about how the main data should be arranged and structured; the main example of this is a data dictionary
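To make the three forms concrete, the sketch below attaches all of them to one hypothetical video file. Every field name and value is invented for illustration; real systems use their own metadata standards.

```python
# Hypothetical metadata record for a single video file.
video_metadata = {
    "administrative": {
        "access_rights": ["staff", "admin"],  # who may open the file
        "creation_date": "2024-01-05",
    },
    "descriptive": {
        "title": "Induction training",
        "runtime_seconds": 312,
        "resolution": "1920x1080",
    },
    "structural": {
        # A tiny data dictionary: how the main data is arranged.
        "data_dictionary": {
            "frame_index": "integer",
            "timestamp": "float, seconds",
        },
    },
}


def can_access(metadata, role):
    """Administrative metadata decides whether a role may open the file."""
    return role in metadata["administrative"]["access_rights"]


def matches_search(metadata, term):
    """Descriptive metadata supports search, here a simple title match."""
    return term.lower() in metadata["descriptive"]["title"].lower()


staff_allowed = can_access(video_metadata, "staff")
guest_allowed = can_access(video_metadata, "guest")
found = matches_search(video_metadata, "training")
```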
    • If data is accessed without authorisation it can be stolen or edited; in either of these cases, the data may be:
      Biased
      Inaccurate
      Unreliable
    • Data lakes are useful for quickly changing data
    • Creation date is an example of administrative metadata