2. What is PII
PII: Personally Identifiable Information:
“information that can be used to distinguish or trace an individual’s identity,
either alone or when combined with other personal or identifying information
that is linked or linkable to a specific individual. (...) Rather, it requires a
case-by-case assessment of the specific risk that an individual can be identified.”
Source: OMB M-10-23 (Guidance for Agency Use of Third-Party Websites and Applications)
https://www.whitehouse.gov/sites/whitehouse.gov/files/omb/memoranda/2010/m10-23.pdf
3. Examples of PII based on NIST
Source: NIST Publication: Guide to Protecting the Confidentiality of Personally Identifiable Information (PII)
https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-122.pdf
● Name
● Address
● Phone number
● Email
● Credit card
● Asset information
● Biometric information
● Religion
● Passport
4. Categorizing PII information
PII, sensitive, or regulated information can differ for each organization and
deployment. Each organization needs to assess such information and classify it
based on its impact level. The same PII can be more or less sensitive
depending on the organization and its activity.
NIST classifies such information into 3 potential impact levels:
● Low
● Moderate
● High
Source: Standards for Security Categorization of Federal Information and Information Systems
https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.199.pdf
5. Categorizing PII Information By Impact level
NIST suggests 6 factors to determine PII confidentiality impact levels:
1. Identifiability
2. Quantity of PII
3. Data field sensitivity
4. Context of use
5. Obligation to protect confidentiality
6. Access to and location of PII
Source: Guide to Protecting the Confidentiality of Personally Identifiable Information (PII)
https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-122.pdf
6. Categorizing PII information by Impact level
1. Identifiability: “how easy it is to identify a person. SSN uniquely and directly identifies an
individual, whereas a telephone area code identifies a set of people.”
2. Quantity of PII: “how many individuals can be identified from the PII”
3. Data field sensitivity: “individual’s SSN or financial account number is generally more sensitive
than an individual’s phone number or ZIP code. Organizations should also evaluate the sensitivity
of the PII data fields when combined.”
4. Context of use: the same data fields can carry a different impact depending on their purpose.
Compare a list of “people who subscribe to a general-interest newsletter produced by the
organization” with a list of “people who work undercover in law enforcement.”
5. Obligation to protect confidentiality: some organizations, such as the IRS, are “subject to
specific legal obligations to protect certain types of PII”
6. Access to and location of PII: The more often PII is accessed, the higher the risk.
The location of PII can also increase the risk.
Source: NIST 800-122 Guide to Protecting the Confidentiality of Personally Identifiable Information (PII)
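The six factors above can be recorded per dataset and rolled up into one of the three impact levels. The sketch below is illustrative only: NIST 800-122 asks organizations to weigh the factors together and does not prescribe a formula, so taking the highest rating across factors is an assumption made here, and the factor keys and the `overall_impact` function are hypothetical names.

```python
from enum import IntEnum


class Impact(IntEnum):
    LOW = 1
    MODERATE = 2
    HIGH = 3


# Hypothetical keys for the six NIST 800-122 factors listed above.
FACTORS = (
    "identifiability",
    "quantity",
    "field_sensitivity",
    "context_of_use",
    "obligation_to_protect",
    "access_and_location",
)


def overall_impact(ratings: dict) -> Impact:
    """Roll up per-factor ratings by taking the highest one.

    Assumption: a conservative "highest rating wins" rule. This is a
    convention chosen for the sketch, not a rule stated by NIST.
    """
    missing = [f for f in FACTORS if f not in ratings]
    if missing:
        raise ValueError(f"unrated factors: {missing}")
    return max(ratings[f] for f in FACTORS)


# Same data fields, different context of use (the newsletter vs.
# undercover-officer example quoted above).
newsletter_list = {f: Impact.LOW for f in FACTORS}
undercover_list = dict(newsletter_list, context_of_use=Impact.HIGH)
```

With these ratings, `overall_impact(newsletter_list)` evaluates to `Impact.LOW` and `overall_impact(undercover_list)` to `Impact.HIGH`, which mirrors how context of use alone can change the classification.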
7. Metadata: PII / Sensitive Data
Metadata can contain PII and sensitive content.
IP and MAC addresses are common examples of PII / sensitive information in
metadata, but they are not the only ones.
In the vast majority of cases, a single piece of metadata provides just a trace.
Combining metadata can make it possible to distinguish a specific individual. When
such information is enriched with auditing metadata, it becomes easier to find
targets, which makes the combination more sensitive.
Big data leveraged by AI makes metadata even more vulnerable, because it can be
mixed with multiple sources, then crunched and processed by an AI engine.
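The effect of combining metadata can be shown with a small count. The sketch below groups records by a chosen set of fields and reports the size of the smallest group; a smallest group of 1 means at least one record is unique on that combination. The field names (`ip_prefix`, `browser`, `login_hour`) and the sample events are made up for illustration.

```python
from collections import Counter


def smallest_group(records, fields):
    """Return the size of the smallest group of records that share
    the same values for the given fields."""
    groups = Counter(tuple(r[f] for f in fields) for r in records)
    return min(groups.values())


# Hypothetical audit-style metadata, not taken from any real system.
events = [
    {"ip_prefix": "10.0.1", "browser": "Firefox", "login_hour": 8},
    {"ip_prefix": "10.0.1", "browser": "Chrome", "login_hour": 8},
    {"ip_prefix": "10.0.1", "browser": "Firefox", "login_hour": 22},
    {"ip_prefix": "10.0.1", "browser": "Chrome", "login_hour": 8},
]

by_ip = smallest_group(events, ["ip_prefix"])
by_ip_browser = smallest_group(events, ["ip_prefix", "browser"])
by_all = smallest_group(events, ["ip_prefix", "browser", "login_hour"])
```

On this sample, one field alone leaves every record in a group of 4, two fields shrink the smallest group to 2, and all three fields isolate a single record.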
8. Why is BI critical for PII / Sensitive content?
✓ BI centralizes information from multiple sources via ETL jobs
✓ BI lets you mix & transform information from multiple sources
✓ BI is a portal to share content with governed users
✓ BI is a portal to share information with non-governed users
✓ BI can be used as an ETL (unfortunately)
➡ Always ensure you have full data integrity from the source to the
recipient
11. 6 Steps To Deal With PII / Sensitive Information In Your Analytics
● Step #1: Finding such information
● Step #2: Tag & catalog
● Step #3: Documenting, reporting and monitoring
● Step #4: Security, account / user recertification & PIA
● Step #5: Control content being shared
● Step #6: Archiving and deleting
12. Step #1 Finding Such Information
● Get input from the GRC team
● Access external data catalogs
● Exchange with the business
In our experience, business users tend to have the best understanding of what contains PII.
Their GRC team needs to train them to classify such information. A good way to
start classifying PII is to understand:
Direct identifiers: information that can directly identify an individual (also called directly
identifying variables or direct identifying data), such as name, address, or SSN.
Quasi-identifiers: information that can be aggregated to identify a person (also called
indirect identifiers or indirectly identifying variables), such as date of birth or ZIP code.
Source: NISTIR 8053 De-Identification of Personal Information
https://nvlpubs.nist.gov/nistpubs/ir/2015/NIST.IR.8053.pdf
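A first pass at finding candidate PII can be as simple as matching column or object names against keyword lists. The sketch below is a rough starting point only: the keyword sets are assumptions built from the examples above (name, address, SSN, date of birth, ZIP), and `classify_column` is a hypothetical helper. Name matching produces false positives (for example `file_name`) and misses badly named fields, so results still need review by business users and the GRC team.

```python
# Hypothetical keyword lists; extend them with your GRC team.
DIRECT_KEYWORDS = {"name", "address", "ssn", "email", "phone"}
QUASI_KEYWORDS = {"birthday", "birth", "dob", "zip"}


def classify_column(column_name: str) -> str:
    """Classify a column name as 'direct', 'quasi', or 'unclassified'
    based on the words it contains."""
    tokens = set(column_name.lower().replace("-", "_").split("_"))
    if tokens & DIRECT_KEYWORDS:
        return "direct"
    if tokens & QUASI_KEYWORDS:
        return "quasi"
    return "unclassified"


columns = ["customer_name", "zip_code", "Birth_Date", "order_total"]
report = {c: classify_column(c) for c in columns}
```

Direct identifiers are checked first so that a column matching both lists is reported at the stricter level.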
13. Step #2 Tag & Catalog
In SAP BusinessObjects you have 2 levels of granularity that can be leveraged:
● Document
● Object-level
Object-level is preferable, as it offers the highest granularity. However, this only
applies when you are leveraging Universes. If you are not leveraging
Universes, you will need to tag at the document level.
Keep a consistent nomenclature for such information so that it can be classified by
impact level.
When tagging such information, consider its life cycle: the information needs to be
traced, and its end of life needs to be planned.
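One way to keep a consistent nomenclature is to encode the impact level and a planned end-of-life date in the tag itself. The sketch below assumes a tag format of `PII:<LEVEL>:<YYYY-MM-DD>`; this format and the `PiiTag` class are hypothetical and are not a SAP BusinessObjects convention.

```python
from dataclasses import dataclass
from datetime import date

LEVELS = ("LOW", "MODERATE", "HIGH")


@dataclass(frozen=True)
class PiiTag:
    level: str       # one of LEVELS
    end_of_life: date  # planned date for archiving or deletion

    def __post_init__(self):
        if self.level not in LEVELS:
            raise ValueError(f"unknown impact level: {self.level}")

    def encode(self) -> str:
        """Render the tag as text, e.g. for a keyword or description field."""
        return f"PII:{self.level}:{self.end_of_life.isoformat()}"

    @classmethod
    def decode(cls, text: str) -> "PiiTag":
        """Parse a tag produced by encode(); raises ValueError if malformed."""
        prefix, level, day = text.split(":")
        if prefix != "PII":
            raise ValueError(f"not a PII tag: {text}")
        return cls(level, date.fromisoformat(day))


tag = PiiTag("HIGH", date(2030, 1, 31))
encoded = tag.encode()
```

Because the tag can be parsed back with `PiiTag.decode`, later reporting or clean-up jobs can read the level and the planned end of life without a separate lookup table.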
16. Step #3 Documenting, Reporting, And Monitoring
Reporting on and monitoring PII & sensitive information requires access to the full
metadata:
● CMS
● Auditor
● FileStore
Ensure you document data lineage and understand the connections and ETLs.
This also requires BI-on-BI, lineage, impact analysis, and reporting
capabilities.
Besides tracking PII in your analytics solution, make sure you track the
activity on that PII: which BusinessObjects users are viewing instances with
PII, and what they do with such information. This can be traced via a username
or IP address.
Source: Metadata for Analytics and BI solutions
https://360suite.io/white-paper/metadata-for-bi-and-analytics-solutions/
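Tracking activity on PII comes down to joining audit events with the set of documents tagged as PII. The sketch below assumes audit events have already been exported as dictionaries with `user`, `action`, and `document_id` keys; those keys and the sample values are hypothetical and do not reflect the actual Auditor schema.

```python
from collections import defaultdict


def pii_views_by_user(audit_events, pii_document_ids):
    """Map each user to the set of PII-tagged documents they viewed."""
    views = defaultdict(set)
    for event in audit_events:
        if event["action"] == "view" and event["document_id"] in pii_document_ids:
            views[event["user"]].add(event["document_id"])
    return dict(views)


# Hypothetical sample data.
pii_docs = {"doc-17", "doc-42"}
events = [
    {"user": "alice", "action": "view", "document_id": "doc-17"},
    {"user": "alice", "action": "view", "document_id": "doc-03"},
    {"user": "bob", "action": "refresh", "document_id": "doc-42"},
    {"user": "carol", "action": "view", "document_id": "doc-42"},
]

viewers = pii_views_by_user(events, pii_docs)
```

The same join works with an IP address instead of a username if that is the only identifier available in the audit trail.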
19. Step #4 Security, Account / User recertification & PIA
Analyze which users have which permissions, and compare that access to the
actual needs determined by policymakers.
● Document complete security & double inheritance
● Compare security over time
● Track actions and non-actions from users, documents, and applications
● Monitor decommissioned users and contents
● Manage quick removal of decommissioned users
● Produce reports as proof of how decommissioned users were handled
● Provide safe disaster recovery with decommissioned-user tracking
● View publication and bursting schedules
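The comparison between actual access and approved needs can be sketched as a set difference per user. The input shapes below (user name mapped to a set of folder or document names) and the `recertify` function are assumptions for illustration; extracting the effective rights from the platform, including inherited ones, is outside this sketch.

```python
def recertify(actual, approved, decommissioned):
    """Return findings as (user, issue, items) tuples.

    actual:         user -> set of resources the user can access today
    approved:       user -> set of resources policy says the user needs
    decommissioned: set of users who should no longer have any access
    """
    findings = []
    for user, rights in sorted(actual.items()):
        if user in decommissioned:
            findings.append(
                (user, "decommissioned user still has access", sorted(rights))
            )
            continue
        excess = rights - approved.get(user, set())
        if excess:
            findings.append((user, "access exceeds approved need", sorted(excess)))
    return findings


# Hypothetical sample data.
actual = {
    "alice": {"HR Reports", "Finance"},
    "bob": {"Finance"},
    "dave": {"HR Reports"},
}
approved = {"alice": {"Finance"}, "bob": {"Finance"}}
findings = recertify(actual, approved, decommissioned={"dave"})
```

Running the same comparison on snapshots taken at different dates gives a simple way to compare security over time.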
21. Step #5 Control Content Being Shared
3 scenarios:
1. Governed (content shared within the BI platform)
2. Governed data dump (governed users exporting governed documents to
ungoverned formats) => De-identify PII
3. Ungoverned (content shared outside the BI platform: XLS, PDF, CSV, etc.) =>
De-identify PII
For ungoverned content, a few good practices:
● De-identify information
● Secure by password
● Tag metadata
● Watermark
● Encrypt
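De-identifying rows before they leave the BI platform can be sketched as follows. This example drops direct identifiers and replaces one key field with a keyed pseudonym (HMAC-SHA256 from the Python standard library). The field names, the `deidentify_row` helper, and the choice of HMAC are assumptions for illustration. Note that this is pseudonymization: rows for the same person stay linkable, and any remaining quasi-identifiers (such as a ZIP code) may still need to be generalized or removed.

```python
import hashlib
import hmac


def pseudonym(value: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from a value using a secret key."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify_row(row, secret_key, key_field="ssn", drop_fields=("name", "address")):
    """Return a copy of the row without direct identifiers.

    The key field is replaced by a pseudonym stored under 'subject_id';
    the fields in drop_fields are removed entirely.
    """
    out = {
        k: v for k, v in row.items()
        if k not in drop_fields and k != key_field
    }
    out["subject_id"] = pseudonym(row[key_field], secret_key)
    return out


# Hypothetical sample row and key; keep the real key outside the export.
key = b"demo-key-not-for-production"
row = {"name": "Ann Example", "address": "1 Main St",
       "ssn": "123-45-6789", "zip": "75001", "amount": 10}
export_row = deidentify_row(row, key)
```

A keyed hash is used instead of a plain hash so that someone holding only the export cannot rebuild the pseudonyms by hashing a list of known SSNs.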
24. Step #6 Archiving And Deleting
At the end of its life cycle, PII needs one of the following:
● Archiving
● Deletion
● De-identification (in some cases)
Note that sensitive content may need to be re-tagged / re-classified.
Keep in mind that PII is composed of data and metadata.
When archiving, consider using a widely readable format and leverage a WORM
technology (Write Once Read Many).
Source: Write Once Read Many
https://en.wikipedia.org/wiki/Write_once_read_many
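The end-of-life decision can be automated once each item carries its planned dates. The sketch below assumes two dates per item, one after which it should be archived and one after which it should be deleted; both dates and the `end_of_life_action` function are hypothetical, and the actual retention periods must come from your own policy and legal obligations.

```python
from datetime import date


def end_of_life_action(today: date, archive_after: date, delete_after: date) -> str:
    """Return 'keep', 'archive', or 'delete' for an item on a given day."""
    if archive_after > delete_after:
        raise ValueError("archive date must not be later than delete date")
    if today >= delete_after:
        return "delete"
    if today >= archive_after:
        return "archive"
    return "keep"


# Hypothetical dates for one tagged document.
archive_on = date(2026, 1, 1)
delete_on = date(2031, 1, 1)
```

For this document, a run on 2025-06-30 yields `keep`, a run on 2027-03-15 yields `archive`, and a run on 2031-01-01 yields `delete`.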
26. Take Away
➔ PII & Sensitive content are easy to deal with.
➔ Make sure you exchange with your Business.
➔ Understand the BI information flow.
➔ Whatever decision you make, ensure it takes into consideration the lifecycle.
➔ Automate the lifecycle management.
Handling PII & Sensitive information at EDW level is not sufficient,
and Analytics is the most visible part.
27. Sources
● NIST, Guide to Protecting the Confidentiality of Personally Identifiable Information (PII), April
2010, https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-122.pdf
● NIST, De-Identification of Personal Information, October 2015,
https://nvlpubs.nist.gov/nistpubs/ir/2015/NIST.IR.8053.pdf
● 360Suite, Account And User Recertification for SAP BusinessObjects, January 2018,
https://360suite.io/blog/business-objects-user-account-recertification/
● 360Suite, SAP BusinessObjects Security: The 5 Key Questions You Need To Answer, April 2020,
https://360suite.io/blog/business-objects-security/
● NIST, Standards for Security Categorization of Federal Information and Information Systems,
April 2004, https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.199.pdf
● Wikipedia, Write Once Read Many, https://en.wikipedia.org/wiki/Write_once_read_many
● 360Suite, Metadata For BI And Analytics Solutions,
https://360suite.io/white-paper/metadata-for-bi-and-analytics-solutions/
PI: Personal Information: “Personal information” means information that identifies, relates to, describes, is capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular consumer or household.
Source: CCPA, section 1798.140 (o)(1)
NPI: Non-Public Personal Information => PII not available to the public
Examples of PII (NIST 800-122):
● Name, such as full name, maiden name, mother’s maiden name, or an alias.
● Personal identification number, such as social security number (SSN), passport number, driver’s license number, taxpayer identification number, patient identification number, bank account number, or credit card number.
● Address information, such as street address or email address.
● Asset information, such as Internet Protocol (IP) or Media Access Control (MAC) address or other host-specific persistent static identifier that consistently links to a particular person or small, well-defined group of people.
● Telephone numbers, including mobile, business, and personal numbers.
● Personal characteristics, including photographic image (especially a face or another distinguishing characteristic), x-rays, fingerprints, or other biometric image or template data (e.g., retina scan, voice signature, facial geometry).
● Information identifying personally owned property, such as vehicle registration number or title number and related information.
● Information about an individual that is linked or linkable to one of the above (e.g., date of birth, place of birth, race, religion, weight, activities, geographical indicators, employment information, medical information, education information, financial information).
How do you handle PII?
● We follow a process to handle PII
● We handle it case by case
● No specific handling
● We don’t have PII
● Not sure how to start
PIA (Privacy Impact Assessment): if your analytics contain PII, they should go through a PIA.
“structured reviews of how information is handled: (i) to ensure handling conforms to applicable legal, regulatory, and policy requirements, (ii) to determine the risks and effects of collecting, maintaining and disseminating information in identifiable form in an electronic information system, and (iii) to identify and evaluate protections and alternative processes for handling information to mitigate potential privacy risks”
Source: OMB M-03-22, Guidance for Implementing the Privacy Provisions of the E-Government Act of 2002
The E-Government Act of 2002 requires Federal agencies to conduct PIAs when:
● Developing or procuring information technology that collects, maintains, or disseminates information that is in an identifiable form; or
● Initiating a new collection of information that:
– Will be collected, maintained, or disseminated using information technology; and
– Includes any information in an identifiable form permitting the physical or online contacting of a specific individual, if identical questions have been posed to, or identical reporting requirements imposed on, 10 or more persons, other than agencies, instrumentalities, or employees of the Federal Government.
Source: NIST 800-122
De-identification: “general term for any process of removing the association between a set of identifying data and the data subject.” [p. 3]
Anonymization: “process that removes the association between the identifying dataset and the data subject.” [p. 2]
Pseudonymization: “particular type of anonymization that both removes the association with a data subject and adds an association between a particular set of characteristics relating to the data subject and one or more pseudonyms.”
“Anonymization is another subcategory of de-identification. Unlike pseudonymization, it does not provide a means by which the information may be linked to the same person across multiple data records or information systems. Hence reidentification of anonymized data is not possible.” [p. 6]
WORM means that only an act of willful (physical) destruction will remove information from disks before the set retention date. Many systems archive information, and this is a very important first step; however, in the highly regulated financial industry, ESI needs to be stored in this secure format.