README.md (+29 -12)
@@ -40,11 +40,28 @@ The fixed structure must be technology-agnostic. The first fields of the fixed s
40
40
*`Email: [Option[String]]` point of contact between consumers and maintainers of the Data Product. It could be the owner or a distribution list, but must be reliable and responsive.
41
41
*`OwnerGroup [String]`: LDAP user/group that owns the data product.
42
42
*`DevGroup [String]`: LDAP user/group in charge of developing and maintaining the data product.
43
-
*`InformationSLA: [Option[String]]` describes what SLA the Data Product team is providing to answer additional information requests about the Data Product itself.
43
+
*`SupportSLA: [Option[String]]` describes what SLA the Data Product team is providing when some support is needed.
44
+
*`SupportHours: [Option[String]]` defines when support is available, e.g. during working days from 9 to 18.
45
+
*`ResponseTime: [Option[String]]` defines the amount of time needed to take care of an incoming feature.
46
+
*`ResolutionTime: [Option[String]]` defines the amount of time needed to fix the data.
47
+
*`InformationTime: [Option[String]]` defines the amount of time needed to answer clarification questions.
44
48
*`Status: [Option[String]]` this is an enum representing the status of this version of the Data Product. Allowed values are: `[Draft|Published|Retired]`. This metadata communicates the overall status of the Data Product but does not reflect the actual deployment status.
45
49
*`Maturity: [Option[String]]` this is an enum to let the consumer understand if it is a tactical solution or not. It is really useful during migration from Data Warehouse or Data Lake. Allowed values are: `[Tactical|Strategic]`.
46
50
*`Billing: [Option[Yaml]]` this is a free-form key-value area where it is possible to put information useful for resource tagging and billing.
47
51
*`Tags: [Array[Yaml]]` Tag labels at DP level (please refer to [OpenMetadata documentation](https://docs.open-metadata.org/v1.0.0/main-concepts/metadata-standard/schemas/type/taglabel)).
52
+
*`BusinessConcepts: [Array[Yaml]]` Link with Business Concepts coming from the Business Ontology/Glossary at DP level (please refer to [OpenMetadata documentation](https://docs.open-metadata.org/v1.0.0/main-concepts/metadata-standard/schemas/type/taglabel)). The `source` field must be "Glossary" and the `href` must link to the URI of the external glossary or ontology.
53
+
*`SecurityInfo: [Yaml]` Security attributes provide guidance to understand who can access this Data Product and which authorizations are needed
54
+
*`Confidentiality: [Option[String]]` This field indicates the level of confidentiality assigned to the data product. It defines how sensitive the data is and determines the access controls and protections that need to be in place. Common examples might include "Public," "Internal," "Confidential," or "Secret."
55
+
*`Visibility: [Option[String]]` This field defines the scope of visibility for the data product. It dictates which users, teams, or systems can view or access the data. For example, it could specify whether the data is visible to only specific internal departments
56
+
*`GDPR: [Option[String]]` This field indicates whether the data product is subject to the General Data Protection Regulation (GDPR), and if so, what specific measures or classifications apply. Expected values are Yes or No.
57
+
*`BusinessInfo: [Yaml]`
58
+
*`ValueProposition: [Option[String]]`: Describes the value proposition of the data product from a business standpoint.
59
+
*`ValueGeneration: [Option[String]]`: Defines what kind of value this DP will generate. It could be a "Foundation" DP (typically a source-aligned one), an "Operation Monitoring" DP that collects information about the company processes and provides decision support, or a "Revenue Generation" DP that can be directly monetized.
60
+
*`StakeholderRoles: Array[String]`: List of stakeholders involved, interested and supporting this data product
61
+
*`PricingType: [Option[String]]`: It could be Subscription or Pay as You Consume
62
+
*`PricingInfo: [Yaml]`: Free structure field to describe the pricing structure of the data product
63
+
*`StrategicInitiatives: Array[String]` Links the Data Product to the strategic initiatives of the company; for example, it is possible to link company OKRs.
64
+
*`TargetConsumption: [Array[String]]` Defines the ideal consumption cases for this data product. It could be analytics, reporting, online application, etc.
48
65
*`Specific: [Yaml]` this is a custom section where we can put all the information strictly related to a specific execution environment. It can also refer to an additional file. At this level we also embed all the information to provision the general infrastructure (resource groups, networking, etc.) needed for a specific Data Product. For example, if a company decides to create a ResourceGroup for each data product and have a subscription reference for each domain and environment, it will be specified at this level. It is also recommended to put general security settings here: Azure Policy or IAM policies, VPC/VNet, Subnet. This section is filled by merging the data defined at the common level with the values defined specifically for the selected environment.
49
66
50
67
The **unique identifier** of a Data Product is the concatenation of Domain, Name and Version. So we will refer to the `DP_UK` as a URN which ends in the following way: `$DPDomain:$DPName:$DPMajorVersion`.
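As a rough illustration of the Data Product level fields added in this hunk, a descriptor fragment might look like the sketch below. The camelCase keys mirror the style of the example descriptor, the nesting of the child fields under `securityInfo` and `businessInfo` is an assumption, and every value (including the glossary URL) is hypothetical.

```yaml
# Hypothetical fragment: all values are illustrative, nesting is assumed
businessConcepts:
  - tagFQN: Customer
    source: Glossary            # must be "Glossary" for business concepts
    labelType: Manual
    state: Confirmed
    href: https://glossary.example.com/customer   # placeholder URI
securityInfo:
  confidentiality: Internal
  visibility: marketing and sales departments only
  gdpr: "Yes"                   # quoted so it is read as a string, not a boolean
businessInfo:
  valueProposition: a single view of customer profitability
  valueGeneration: Operation Monitoring
  stakeholderRoles:
    - Marketing Manager
  pricingType: Subscription
  pricingInfo:
    monthlyFee: 100 EUR         # free structure, keys are not prescribed
  strategicInitiatives:
    - OKR-2024-Customer-Retention
  targetConsumption:
    - analytics
    - reporting
```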
@@ -81,18 +98,18 @@ Constraints:
81
98
*`IntervalOfChange: [Option[String]]` how often changes in the data are reflected.
82
99
*`Timeliness: [Option[String]]` the skew between the time that a business fact occurs and when it becomes visible in the data.
83
100
*`UpTime: [Option[String]]` the percentage of port availability.
84
-
*`TermsAndConditions: [Option[String]]` If the data is usable only in specific environments.
85
101
*`Endpoint: [Option[URL]]` this is the API endpoint that self-describes the output port and provides insightful information at runtime about the physical location of the data, the protocol that must be used, etc.
86
-
*`biTempBusinessTs: [Option[String]]` name of the field representing the business timestamp, as per the "bi-temporality" definition; it should match with a field in the related `Schema`
87
-
*`biTempWriteTs: [Option[String]]` name of the field representing the technical (write) timestamp, as per the "bi-temporality" definition; it should match with a field in the related `Schema`
88
-
*`DataSharingAgreement: [Yaml]` This part covers usage, privacy, purpose, and limitations, and is independent of the data contract.
89
-
*`Purpose: [Option[String]]` what is the goal of this data set.
90
-
*`Billing: [Option[String]]` how a consumer will be charged back when it consumes this output port.
91
-
*`Security: [Option[String]]` additional information related to security aspects, like restrictions, masking, sensitive information and privacy.
92
-
*`IntendedUsage: [Option[String]]` any other information needed by the consumer in order to effectively consume the data. It could be related to technical aspects (e.g. extract no more than one year of data for good performance) or to business domains (e.g. this data is only useful in the marketing domain).
93
-
*`Limitations: [Option[String]]` If any limitation is present it must be made super clear to the consumers.
94
-
*`LifeCycle: [Option[String]]` Describe how the data will be historicized and how and when it will be deleted.
95
-
*`Confidentiality: [Option[String]]` Describe what a consumer should do to keep the information confidential, how to process and store it. Permission to share or report it.
102
+
*`DataSharingAgreement: [Yaml]` This part covers usage, privacy, purpose, and limitations, and is independent of the data contract.
103
+
*`TermsAndConditions: [Option[String]]` If the data is usable only in specific environments.
104
+
*`Purpose: [Option[String]]` what is the goal of this data set.
105
+
*`Billing: [Option[String]]` how a consumer will be charged back when it consumes this output port.
106
+
*`Security: [Option[String]]` additional information related to security aspects, like restrictions, masking, sensitive information and privacy.
107
+
*`IntendedUsage: [Option[String]]` any other information needed by the consumer in order to effectively consume the data. It could be related to technical aspects (e.g. extract no more than one year of data for good performance) or to business domains (e.g. this data is only useful in the marketing domain).
108
+
*`Limitations: [Option[String]]` If any limitation is present it must be made super clear to the consumers.
109
+
*`LifeCycle: [Option[String]]` Describe how the data will be historicized and how and when it will be deleted.
110
+
*`Confidentiality: [Option[String]]` Describe what a consumer should do to keep the information confidential, how to process and store it. Permission to share or report it.
111
+
*`biTempBusinessTs: [Option[String]]` name of the field representing the business timestamp, as per the "bi-temporality" definition; it should match with a field in the related `Schema`
112
+
*`biTempWriteTs: [Option[String]]` name of the field representing the technical (write) timestamp, as per the "bi-temporality" definition; it should match with a field in the related `Schema`
96
113
*`Tags: [Array[Yaml]]` Tag labels at OutputPort level, here we can have security classification for example (please refer to [OpenMetadata documentation](https://docs.open-metadata.org/v1.0.0/main-concepts/metadata-standard/schemas/type/taglabel)).
97
114
*`SampleData: [Option[Yaml]]` provides a sample data of your Output Port (please refer to [OpenMetadata specification](https://docs.open-metadata.org/v1.0.0/main-concepts/metadata-standard/schemas/entity/data/table#properties)).
98
115
*`SemanticLinking: [Option[Yaml]]` here we can express semantic relationships between this output port and other outputports (also coming from other domains and data products). For example, we could say that column "customerId" of our SQL Output Port references the column "id" of the SQL Output Port of the "Customer" Data Product.
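To make the rule for `biTempBusinessTs` and `biTempWriteTs` concrete, here is a minimal sketch of an output port fragment in which both values name fields that exist in the schema. The column names, the `timestamp` data type, and the placement of `schema` next to the two keys are assumptions made for illustration only.

```yaml
# Hypothetical fragment: column names and schema placement are assumed
biTempBusinessTs: business_ts   # must match a field name in the schema
biTempWriteTs: write_ts         # must match a field name in the schema
schema:
  - name: business_ts
    dataType: timestamp
    description: when the business fact occurred
  - name: write_ts
    dataType: timestamp
    description: when the record was written
```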
purpose: this output port wants to provide a rich set of profitability KPIs related to the customer
46
-
billing: 5$ for each full scan
47
-
security: In order to consume this output port an additional security check with compliance must be done
48
-
intendedUsage: the dataset is huge so it is recommended to extract maximum 1 year of data and to use these KPIs in the marketing or sales domain, but not for customer care
49
-
limitations: it is not possible to use this data without a compliance check
50
-
lifeCycle: the maximum retention is 10 years, and eviction happens on the first of January
51
-
confidentiality: if you want to store this data somewhere else, PII columns must be masked
73
+
dataSharingAgreements:
74
+
termsAndConditions: only usable in development environment
75
+
purpose: this output port wants to provide a rich set of profitability KPIs related to the customer
76
+
billing: 5$ for each full scan
77
+
security: In order to consume this output port an additional security check with compliance must be done
78
+
intendedUsage: the dataset is huge so it is recommended to extract maximum 1 year of data and to use these KPIs in the marketing or sales domain, but not for customer care
79
+
limitations: it is not possible to use this data without a compliance check
80
+
lifeCycle: the maximum retention is 10 years, and eviction happens on the first of January
81
+
confidentiality: if you want to store this data somewhere else, PII columns must be masked
52
82
tags:
53
83
- tagFQN: experimental
54
84
source: Tag
@@ -59,6 +89,7 @@ components:
59
89
labelType: Manual
60
90
state: Confirmed
61
91
sampleData: {}
92
+
sampleQuery: select * from dp.table
62
93
semanticLinking: {}
63
94
specific:
64
95
directory: history
@@ -127,6 +158,11 @@ components:
127
158
source: Tag
128
159
labelType: Manual
129
160
state: Confirmed
161
+
businessTerms:
162
+
- tagFQN: BusinessAddress
163
+
source: Glossary
164
+
labelType: Manual
165
+
state: Confirmed
130
166
- name: first_hire_date
131
167
dataType: date
132
168
description: the date of his/her first hire in mybank, regardless of whether the contract is temporary or permanent