Sl. No | EOI Page Number | EOI Clause Number | Existing Clause | Query/Suggestion | SBI Response
1 11 Annexure A - Eligibility Criteria, Point
No 2
The vendor should have an existing
Next-Gen Data Warehouse
solution as mentioned in the
EOI.
Please clarify whether this criterion applies to the
OEM or to bidders acting as SI.
Please refer to Corrigendum.
2 11 Annexure A - Eligibility Criteria, Point
No 3
The solution should have been
implemented in at least 2 large
scale organizations.
Documents to be submitted :
Two references with the following
details for each reference to
be provided:
1. Name of the
Organization
2. Name of the Official
3. Contact number of
Official
4. E-mail Id of Official
Please clarify: if the proposed solution has been
implemented in a large organization, can we, as the
bidder, use the OEM's experience to participate in
the RFP? This will help us qualify and also leverage
the OEM's expertise for successful implementation of
the project.
Team Computers is a CMMI L3 IT service and solution
company with more than Rs 500 Cr turnover and a
dedicated focus on BI and Analytics. We have Rs 40
Cr+ yearly revenue from the Analytics business out
of the Rs 500 Cr. A certificate from the CA could
be provided to satisfy the criterion. We have the
capability of building the modern data warehousing
solution.
Please refer to Corrigendum.
3 39 Annex. D (Sr. No 4) Reporting: 100+ Bursted Reports,
300+ Interactive Reports
Query: What is the total number of users who will
consume business intelligence output on the business
side?
Please refer to Hardware Specification, point # 35 - The web portal of the
Business Intelligence tool should support at least 25000 concurrent users,
scalable up to 75000 in the next 5 years, accessing various reports generated.
4 20 Critical Functional Requirement -
Monitoring Dashboard # 1
Real Time data flow in dashboard Query: What is the SLA on real-time reporting? This information is not required at this stage of EOI.
5 32 Critical Functional Requirement -
Business Intelligence (Point 32)
Ability to handle and summarize huge volumes of
data.
Query: How do you want vendors to support this
capability? Do lab reports / outputs work well? Does
this need to be demonstrated?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
6 31 Critical Functional Requirement -
Business Intelligence (Point 21)
Mobile Version Query: Do you require access to BI
reports and dashboards via native iOS and native
Android mobile applications?
Yes
7 14 7 Existing ETL Jobs to be Fine Tuned What ETL tools are being used
currently? If multiple tools are used, approximately
how many of the current 3000+ jobs run on each tool?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
8 14 13 Trigger mechanisms in identifying any structural
changes at source
Can you help clarify what the structural changes are? DDL changes
9 14 14 The ingestion tools should be able to perform a
change data capture on source systems of this
nature with run time decompression functionality
If change data capture is in use now, what is the
vendor product? How many jobs are running on the
different CDC tools?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
10 15 16 scraping 4000-5000 logs daily having log size of ~
2 TB each, scalable up to 10000 logs.
How many source applications are involved
approximately?
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
11 15 17 Solution should be able to handle DDL change
without manual reorg/runstat.
Could you please provide a use case for DDL changes? This is an industry standard concept.
12 15 18 A job scheduler, along with process management
controls that provide things like runtime
monitoring and error alerting, handling, and
logging.
Is there any job scheduler used currently? If yes,
what is it? If more than one, please list them as well.
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
13 16 12 Downstream departments (data consumer) to be
given separate processing power, storage to
undertake their requirements with separate DB
snapshot
Approximately how many downstream departments might
be engaged for DB snapshots? Are they branch IT or
business departments?
This information is not required at this stage of EOI.
14 17 12 ETL/ELT tool for data extraction should have AI/ML
features for suggesting / improving Query / ETL /
ELT stages
Could you elaborate on some use cases of the AI/ML
features? Are you aware of any ETL tool with such
AI/ML features?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
15 18 16 Reports on job status and success / failure /
retrigger should be sent to concerned stakeholders
on a continuous basis
Through which channels are reports sent out: email,
SMS, or a real-time monitoring dashboard? Please
clarify.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
16 18 19 Workflow management tool(s) should have
connectors / pluggable interfaces to already
existing / in-use proprietary software available with
the Bank. These could be (and not restricted to)
data repositories, reporting tools, data analysis
tools and generic interfaces for data transfer.
What are the exact vendor products for these 'data
repositories, reporting tools, data analysis tools
and generic interfaces for data transfer'?
These are industry standard capabilities
17 18 3 Data migration from existing archival solution to
new one.
What is the current archival solution? Is data
archived to a tape library? Please clarify.
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI. Data
Archival is not on tapes and current archival solution is similar to production
environment and accessible to end users.
18 19 6 Migration of monitoring dashboard data points. What dashboard product is
currently in use? Is it an enterprise-level dashboard
that will continue to be used?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI. Bank
will take a final call at appropriate time.
19 19 8 Migration of Data Governance, Data Lineage and
Data Quality rules and policies
Is there any automation tool used for governance,
lineage, quality etc? What are they?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
20 19 9 history of version control, existing tape backup What does 'version control'
mean here? Is it the historical versions of programs /
stored procedures, or a version control tool?
This is an industry standard terminology
21 19 9 history of version control, existing tape backup Does this mean the tape
backup migrates to something like a data lake? Will the
historical tape backup be part of the migration scope, or
should it be left as is, with only backups from now
onwards taken care of?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
22 19 12 Vendor should provide a feasible plan for best use
of existing infrastructure which is procured during
last 10 years
Could you please share the software product stack and
the proprietary or open hardware specifics of the
existing architecture?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
23 19 3 The NEXT-GEN DW – DR solution needs to be set
up at a remote location at Hyderabad.
What is the distance between the PROD site and the DR
site? What is the network bandwidth between them?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
Responses to queries raised by Bidders in REQUEST FOR EXPRESSION OF INTEREST (EOI) FOR “NEXT-GEN DATA WAREHOUSE SOLUTION”. Request for EOI No.: SBI/GITC/Data Warehouse/2018/2019/34 Dated: 23.03.2019
24 20 3 The architecture of NEXT-GEN DW should enable
interaction between public cloud and designated
edge servers alone.
Could you please elaborate with some detailed use
case?
No component of the Next-Gen DW should connect directly to the public
cloud. Secure designated edge servers are to be proposed by the Bidder
for any data transfer between the Next-Gen DW and the public cloud.
25 21 16 Back Dated Data changes needs to be updated on
portal
Could you please clarify with an example of a
back-dated data change?
The monitoring dashboard should have the capability to show back-dated
data changes.
26 22 3 Management of referential integrity Could you please clarify with an
example of referential integrity?
This is an industry standard terminology
27 23 7 Recommend Enrichment — Enhancing the value of
internally held data by appending related attributes
from external sources
Could you please give an example? This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
28 23 10 Identity resolution - Identity resolution is the
process of linking various records and is the main
engine for record de-duplication, which can enable
some aspects of data cleansing.
Could you please give an example? This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
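The clause above defines identity resolution as linking records to drive de-duplication. As a purely illustrative sketch (not part of the EOI or any bidder's solution; field names and normalization rules are assumptions), the idea can be reduced to grouping records on a normalized match key and keeping one survivor per group:

```python
# Minimal identity-resolution sketch: link records that normalize to the
# same (name, phone) key, then keep one survivor record per linked group.
# The field names below are illustrative assumptions, not from the EOI.
from collections import defaultdict

def normalize(record):
    """Build a match key from noisy attributes (lowercase, strip punctuation)."""
    name = "".join(ch for ch in record["name"].lower() if ch.isalnum())
    phone = "".join(ch for ch in record["phone"] if ch.isdigit())[-10:]
    return (name, phone)

def resolve_identities(records):
    """Group records by match key; each group is one resolved identity."""
    groups = defaultdict(list)
    for rec in records:
        groups[normalize(rec)].append(rec)
    # Survivorship rule: keep the first record of each group (de-duplication).
    return [recs[0] for recs in groups.values()]

customers = [
    {"name": "A. Kumar", "phone": "+91 98765 43210"},
    {"name": "a kumar",  "phone": "9876543210"},   # same person, noisy entry
    {"name": "B. Singh", "phone": "9123456780"},
]
print(len(resolve_identities(customers)))  # prints 2: two resolved identities
```

Production tools use fuzzy matching and survivorship policies far richer than this exact-key grouping, but the linking-then-deduplicating structure is the same.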
29 24 1 Authentication and Identity Management - A
comprehensive identity and access management
system should be available for centralized
management of users and groups.
Are you looking for a standalone AIM specific to this
NEXT-GEN DW solution? Or, if a centralized
enterprise-level AIM already exists, is the newly
introduced AIM of this solution required to integrate
or sync with the enterprise-level one? Please
clarify.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
30 25 1 Vendor should follow the RBI guideline in
developing the solution with which it will be easier
for the Bank to migrate to the element-based data
reporting envisaged by the RBI.
Does this mean a new regulatory reporting
framework/application is requested to replace the
existing one, or should the existing regulatory
reporting framework/application be kept, with only the
datastore migrated to the NEXT-GEN DW? What is the
current framework, and from which product vendor?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
31 28 5 Self-service portal to extract the data on their own
(Should support Data Democratization)
Is there any approval procedure, evaluation of
shareability, or masking/transformation prior to
self-service extraction, and any data destruction
procedure afterwards? Who are the main requestors, and
what is the frequency of requests? Please clarify.
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
32 28 1 Implementing end to end analytics use-cases as
mandated by the Bank
Could you list all the analytics use cases, or some
typical cases?
Please refer to Annexure E.
33 28 2 Power data / objects to existing analytics models
built on proprietary tools (IBM SPSS).
Please advise the exact version of IBM SPSS. Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
34 30 1 Capability to connect to various data sources Is there any BI tool being
used at the moment? If yes, what is it, and which version?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
35 39 1 Database appliance What is the product of the existing appliance, e.g.
Teradata or Exadata?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
36 39 3 Scraping of Jobs from Source What kind of scraping tools are used
currently? In this solution, is there any preference
to reuse these tools?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
37 39 3 Scraping of daily 5 TB+ logs What kind of logs are these? Are they
database logs? If yes, what are the database products, e.g. Oracle, DB2,
SQL Server, MySQL, etc.?
Database Logs. The Bidder's proposed solution should have the capability to
scrape logs from any kind of database product.
38 39 4 Reporting Any reporting tool is used? E.g. Tableau etc. Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
39 39 6 Hardware Monitoring What is the Monitoring product and version? Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
40 39 6 Server memory and compute monitoring of total
80+ production servers
What type are these production servers: appliances or
commodity x86 servers? What are their typical
specifications? How many years have they been in use?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
41 39 7 User database of 30000+ officials How many different types of users are
there? How many are data analysts who can write and run
ad hoc SQL queries?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
42 39 8 Job Scheduling What scheduling tool is currently being used?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
43 39 18 10TB Oracle database What version is used? Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
44 48 SCAPM IBM Smart Cloud What kind of applications are running on it? What is
its interaction with the NEXT-GEN DW solution?
Please refer to Corrigendum.
45 40-43 1-32 Functional Use Cases 1) Are these use cases based on existing
solutions with SBI (e.g. Risk, Online Fraud
Prediction/Management, AML, etc.), or are such
solutions to be proposed by the vendor as part of the
Next Gen Warehouse initiative? In that case, will the
bank provide additional information to size/scope
these solutions?
2) If these are existing solutions which will leverage
the Next Gen warehouse, will it be the bidder's
responsibility to integrate these solutions with the
warehouse? If so, can the bank please share the
details of all the solutions that need to be
integrated?
1. These are sample use cases built for execution on the Next Gen DWH
platform over a period of time for analytical studies. Bank at its own
discretion will implement new models/use cases on this setup in future.
2. This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
46 11 Eligibility Criteria, Clause 2 Vendor should have existing Next-Gen Data
Warehouse solution as mentioned in the EOI
Does the vendor need to be the OEM for all solution
components, or may the solution stack be a composite
of other OEM / third-party solutions?
Is a consortium allowed for the complete solution,
services and implementation, or is SBI expecting a
single-point bidder and implementer?
Please refer to Corrigendum.
47 32 Hardware Specifications, 1, 2 The Vendor is required to supply, install,
test, commission, monitor, manage and maintain the IT
System along with operating system and other
peripherals with one-year warranty and AMC for 4
years from the date of delivery at data centers
advised by the Bank.
Is the bidder allowed to have alliances and
subcontracting for the maintenance?
Yes, such engagements will be guided by a certain set of rules which the
Bank will publish at a later stage.
48 13 Data Ingestion, 1 Capable of ingesting data from any source system
in automated manner currently implemented in the
Bank, or any future standard source systems that
the Bank will decide to use with high throughput
and low latency. Vendor to propose performance
benchmarking
Will the details of all categories of ingest jobs
(source types / frequencies / loads / etc.) be made
available, at least at the time of the RFP? "Any
future standard source systems" - can you provide a
few examples of such future systems?
This information is not required at this stage of EOI.
49 15 Data Storage, 1 Vendor should propose effective number of data
storage layers in NEXT-GEN DW between data
ingestion and data consumption.
The number of data storage layers also depends on
the use cases for the implemented data, e.g. based on
latency, access levels, grain, etc. Is it adequate if
the solution capability in this regard is explained
with a couple of examples, while there may be other
constructs as well?
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
50 17 Data Processing, 14 Data transformations should be triggered in
parallel. The NEXT-GEN DW should be capable to
run multiple transformation jobs in parallel. The
NEXT-GEN DW should be able to run at-least 1500
jobs in parallel, scalable up to 5000 in next 5 years,
of varying
The given metric covers only the count of parallel
jobs. However, processor, memory and compute
resources are also determined by the size and
complexity of the jobs and the time window in which
they must conclude. We would require details beyond
the count metric provided for sizing / architecture
recommendations. Can the bank share details of the
complexity mix and matrix and the definitions thereof?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
51 18 Migration from existing Setup, 1 Vendor should propose a detailed seamless
automated migration plan from existing setup to
proposed solution. Plan should focus on less
manual intervention, data reconciliation between
the systems and minimum parallel run of existing
and proposed solution.
We request the bank to share the existing architecture
details, which will be required to address this point
in the EOI.
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
52 32 Hardware Specifications, 2 Software and Solution proposed by vendors should
be compatible with all types of Hardware.
"All types of Hardware" is very generic. Software
components for the specialized jobs mentioned in the
EOI would need specific kinds of hardware. Could you
please explain the purpose behind this requirement?
Are you looking at open source as your first choice
for both hardware and software?
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
53 3 Introduction Please note, the objective of this Request for EOI is
to identify all possible solution (s) for the scope of
work defined in this document.
Please clarify if there will be any shortlisting of
vendors or solutions based on this EOI evaluation.
This information is not required at this stage of EOI.
54 11 Annexure A -2 Vendor should have existing Next-Gen Data
Warehouse solution as mentioned in the EOI
Please clarify if only OEMs having the required
solution capabilities can participate in this EOI, or
whether OEMs can also participate with their System
Integration partners.
Please refer to Corrigendum.
55 40 Annexure E Customer Analytics - Up-sell Is this to be done by building an
analytical model, or is the bank open to a pinpointed
solution such as an Analytical CRM?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
56 13 Annexure B Data Ingestion What are the current systems from which the data
needs to be ingested?
This information is not required at this stage of EOI.
57 17 Annexure B Data Processing Framework Point 1 - For the data
to be accessible and consumable by businesses /
downstream applications, the NEXT-GEN DW
should have robust, highly efficient and parallel
execution of data transformation jobs.
What are the downstream applications referred to herein?
This information is not required at this stage of EOI.
58 23 Annexure B Data Quality - Vendor should propose end-to-end
solution for Data Quality Management starting
from data origin till the data consumption. These
tool (s) to be used for addressing various aspects of
the data quality problem mentioned below on SBI
data set during data ingestion, data processing or
data consumption as advised by Bank on case by
case basis
Is there any DQ tool being used currently?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
59 25 Annexure B Regulatory Reporting - Vendor should follow the
RBI guideline in developing the solution with which
it will be easier for the Bank to migrate to the
element-based data reporting envisaged by the RBI
Is the element at a transaction level or an entity level?
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
60 28 Annexure B Data Science platform with AI/ML Capability -
Power data / objects to existing analytics models
built on proprietary tools (IBM SPSS). Migration of
such models to new solution
Which models are referred to herein?
This information is not required at this stage of EOI.
61 29 Annexure B Data Science platform with AI/ML Capability -
Vendor to provide solution / tool (s) for below
scope of activities on SBI data sets;
- Benchmarking
- Predictive & Prescriptive Analytics
- Social Media Analytics
- Web Analytics
- Geolocation Analysis
- Ad-Hoc Analysis
- Trend Indicators
- Profit Analysis
- In-Memory Analysis
- Statistic Analytics
- Data Mining"
Please clarify if the bank is currently using any tool
for geolocation.
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
62 31 Annexure B Business Intelligence Tools - Mobile version: BI
tools should be able to differentiate between
viewing BI applications on a web browser on a
mobile device versus a mobile BI application
a) Please clarify if there is a requirement to view
dashboards on mobile. b) Please clarify if there is a
requirement to view any other data point on a mobile
device as well.
Yes
63 42 Annexure E - Point 20 Organized Fraud/ Collusion Detection - Fraud/AML
Practice - Analysis and identification of factors
causing Fraud/AML practice
a) Please clarify if the bank requires the capability
to detect burst-out frauds in corporate asset products,
retail asset products and liability products.
b) Please clarify if the bank requires the capability
to identify suspicious linkages based on demographics
and transactions to identify collusion.
c) Please clarify if the bank requires ready-to-use
analytical methodologies to reduce false positives in
transaction monitoring in the AML and fraud areas.
d) Please clarify if the bank requires the capability
to build networks based on demographic and transaction
linkages.
e) Please clarify if the bank requires the capability
to identify new patterns / new modi operandi in the
AML / fraud areas.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
64 42 Annexure E - Point 21 Opportunistic Fraud Detection - Identification of
Fraud and its prevention - Identify and predict
potential fraud across the Bank (I e. Cheque fraud,
Remittance fraud, Card fraud, Online fraud, etc)
Please clarify if the bank requires the capability to
build and refine fraud detection models at regular
intervals, to ensure the model learns from the latest
collateral / fraud-tagged data.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
65 42 Annexure E- Point 22 AML detection and alert management -
Identification of Money Laundering activities and
its prevention - Identification of money laundering
activities based on account transaction and
behavior
a) Please clarify if the bank requires the capability
to identify shell company accounts and money mules
which are used as intermediaries for layering.
b) Please clarify if the bank requires an integrated
modelling and deployment framework to develop and test
models for financial crimes.
c) Please clarify if the bank requires the capability
to develop alert scoring models to identify high-risk
alerts.
d) Please clarify if the bank requires capability
related to analytical suppression of AML alerts to
reduce false positives.
e) Please clarify if the bank requires capability
related to customer, account, product, transaction
type, etc. scoring to contribute to alert scoring for
AML.
f) Please clarify if the bank requires the capability
to identify complex layering techniques involving a
large number of accounts.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
66 43 Annexure E- Point 32 Transaction Fraud Prediction - Identification of
Suspicious transactions on Real Time Basis for all
digital channels - Identification of Suspicious
transactions on Real Time Basis for all digital
channels
a) Please clarify if the bank requires capability
related to developing models for scoring digital
transactions based on supervised and unsupervised
techniques.
b) Please clarify if the bank envisages deploying
models in any existing fraud transaction monitoring
tool.
c) Please clarify if the bank intends to procure a
real-time fraud prevention tool as part of this
evaluation/initiative.
d) Does the bank expect to include an intelligent
investigation tool to assess the results of new rules /
models?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
67 40 Annexure E- Point 2 Customers may have the same or a similar product
but might be very different in profitability and
marketing efforts can leverage this information to
sell a higher margin or higher value product to a
profitable customer. An upsell model evaluates this
insight at a customer level
Currently, is there a way of tracking customer
profitability that can aid in deriving insights for
upsell and cross-sell? If yes, how is this information
tracked?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
68 40 Annexure E- Point 3 Various campaigns are run to target existing
customers for cross-selling other products. Budgets
allocated for running such campaigns are limited.
Thus, models to improve response rate to these
cross-sell campaigns are significant. Higher product
penetration per customer also discourages
customer attrition, improves customer loyalty
Are there any specific channels currently being used
for cross-sell campaigns, and is information available
on the responses from those specific channels?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
69 40 Annexure E- Point 10 Identify internal & external factors affecting lead
conversion and forecast the factors for upcoming
quarters
Currently, does the bank have a mechanism to track
lead conversions and the data needed for identifying
the forecast factors?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
70 19 Disaster Recovery : 3 The NEXT-GEN DW – DR solution needs to be set
up at a remote location at Hyderabad.
Please clarify if a backup solution is required at the DR site. Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
71 20 Data Archival & Backup : 2 Data Archival solution should not be visible to end
user, but Archived data should be available for all
end users. For end user it should be a single view
with Data Federation/Virtualization Layer
What directory service is used by SBI? Is it Active
Directory or another directory service?
Active Directory
72 20 Data Archival & Backup :5 Store backup of entire ecosystem on suitable cost-
effective, fast recovery infrastructure (Currently
tape backup is taken)
What is the current backup solution? Is the migration
of old backups in scope? If yes, what is the size and
number of tapes (LTO versions)?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
73 20 Data Archival & Backup :7 Archival and Backup setup must support automated
Data Reconciliation whenever movement from
Current/Live happens
Archival and backup are different solutions meeting
different sets of requirements, and they complement
each other; e.g., archival will reduce the backup
footprint. There won't be any automated data
reconciliation. We request the Bank to remove this
point.
No change in standard clause of EOI
74 20 Cloud Integration and Migration :1 NEXT-GEN DW should be able to consume data
from external cloud-based infrastructures.
Please suggest if there are any specific cloud
providers.
This information is not required at this stage of EOI.
75 6 Downstream Data Consumption -28 Dedicated high-performance department wise
sandboxes allocated to end users for R&D
A sandbox can be spun up/down based on need (instead
of having a dedicated one); this will ensure that
storage and compute are not locked and are optimally
used. We request you to consider modifying this
requirement.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
76 13 Annex B Real time Data ingestion with spontaneous
reconciliation
Please explain 'spontaneous reconciliation' with an example. Spontaneous here means real-time reconciliation.
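Purely as an illustration of what reconciliation at ingestion time can check (a hedged sketch with assumed record shapes, not the Bank's mechanism), one common form compares record counts and an order-independent content checksum between a source batch and what landed in the target:

```python
# Minimal ingestion-reconciliation sketch: after each ingest batch, compare
# record count and a content checksum between the source batch and the rows
# that landed in the target. Record shapes here are illustrative assumptions.
import hashlib

def checksum(rows):
    """Order-independent checksum: hash the sorted row representations."""
    h = hashlib.sha256()
    for row in sorted(rows):
        h.update(repr(row).encode())
    return h.hexdigest()

def reconcile(source_rows, target_rows):
    """True when target matches source on both count and content checksum."""
    return (len(source_rows) == len(target_rows)
            and checksum(source_rows) == checksum(target_rows))

batch  = [(1, "credit", 500.0), (2, "debit", 120.5)]
landed = [(2, "debit", 120.5), (1, "credit", 500.0)]  # arrival order may differ
print(reconcile(batch, landed))  # prints True: counts and content agree
```

Running the same check continuously per micro-batch, rather than in an end-of-day job, is what makes the reconciliation "real time".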
77 14 Critical Functional Requirement -
Data Ingestion #8
Existing ETL Jobs to be Fine Tuned Is the rationalization of ETL jobs also expected? Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
78 15 16 Proposed solution should be able to scrap
encrypted log, capture Metadata changes at source
level completely, scrapping 4000-5000 logs daily
having log size of ~ 2 TB each scalable up to 10000
logs. Proposed solution should be capable of
scrapping logs generated by any type of Database.
E.g. Oracle Database, IBM DB2 Database etc.
Please reconfirm if the log size is 2TB, because that
would imply that the total size of daily logs is 10000 *
2TB = 20000TB
Please refer to Corrigendum.
79 15 16 Proposed solution should be able to scrap
encrypted log, capture Metadata changes at source
level completely, scrapping 4000-5000 logs daily
having log size of ~ 2 TB each scalable up to 10000
logs. Proposed solution should be capable of
scrapping logs generated by any type of Database.
E.g. Oracle Database, IBM DB2 Database etc.
Our suggestion is that it would be appropriate for the
bank to specify the anticipated source types to be
supported for log scraping rather than 'any type of
database'
No change in standard clause of EOI
80 15 17 Solution should be able to handle DDL change
without manual reorg/runstat. It should handle
network fluctuations and hindrances.
Request you to please elaborate on this requirement
and specify which solution component or solution
capability this refers to, DDL changes to which
database, and the nature of hindrances being referred
to
DDL changes in source database.
Nature of hindrances includes all possible failures
81 15 Critical Functional Requirement -
Data Ingestion #8
Solution should be able to handle DDL change
without manual reorg/runstat. It should handle
network fluctuations and hindrances.
Is it a DDL change in the source system or a DDL change
in the Data Warehouse? Please provide examples of
network malfunctions and hindrances.
DDL changes in source database.
Nature of hindrances includes all possible failures
82 15 3 A multi-temperature data management solution to
be proposed by vendor where data that is
frequently accessed on fast storage—hot
data—compared to less-frequently accessed data
stored on slightly slower storage—warm data—and
rarely accessed data stored on the slowest storage
—cold data. The system should also be capable of
automated storage tiering and seamless data
transfer between hot, warm and cold storage. Data
residing in any of these storage areas must be
seamlessly mixed / merged according to
requirements without impacting performance.
1. We believe the bank would want multi-temperature
data management to be within the storage system as
well as across the storage system. Please confirm.
2. Please confirm what will be the data movement
trigger criteria. E.g. Age of file and/or access frequency
of file etc.
3. Please confirm if the understanding of tiers vis a vis
storage technology is correct: -
Tier 1 - Hot Data - Flash Storage
Tier 2 - Warm Data - Spinning Disks (NLSAS) based
storage
Tier 3 - Cold Data - Tape Storage
4. Please confirm what could be the likely
performance requirement in terms of latency to be
provisioned from storage system?
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
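To illustrate the clause (not a prescribed design): automated tiering typically reduces to a policy mapping age of last access and access frequency to a tier. The thresholds below are invented for the sketch; actual values would be tuned by the Bank and the bidder per workload:

```python
from datetime import datetime, timedelta

# Illustrative thresholds; not from the EOI.
HOT_WINDOW = timedelta(days=30)     # recently accessed -> flash (hot)
WARM_WINDOW = timedelta(days=365)   # occasionally accessed -> spinning disk (warm)

def tier_for(last_access: datetime, accesses_per_day: float, now: datetime) -> str:
    """Assign hot/warm/cold based on age of last access and access frequency."""
    age = now - last_access
    if age <= HOT_WINDOW or accesses_per_day >= 10:
        return "hot"
    if age <= WARM_WINDOW or accesses_per_day >= 0.1:
        return "warm"
    return "cold"

now = datetime(2019, 4, 1)
assert tier_for(now - timedelta(days=2), 50, now) == "hot"
assert tier_for(now - timedelta(days=90), 0.5, now) == "warm"
assert tier_for(now - timedelta(days=900), 0.0, now) == "cold"
```

A background job would periodically re-evaluate such a policy over the catalog and schedule moves, so that hot/warm/cold placement tracks actual usage without manual intervention.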
83 16 4 Storage replication (e.g. RAID) should be
automatically managed by the platform.
We believe storage replication (RAID) technology is
desired for protecting data within the system from
disk failures. Also, please confirm if the RAID feature
should be highly resilient, with a minimum of dual-disk
protection.
Please confirm if there is a need to replicate data
using a storage-based replication methodology to
another site (e.g. DR site), or whether application-level
replication is also acceptable.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
84 16 5 Tool should have capability to store/swap data in
memory, disk and distributed storage areas
depending on the age of the data determined by its
usages through user queries
Please confirm if distributed storage area requires a
distributed file system based scale out storage which
can grow in performance and capacity linearly as per
the system demands and still provides a single global
namespace to DWH application.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
85 16 6 User should be able to work on DB even while
backup is in progress. They should be able to run
statistics and reorganize their tables. Any
background process including backup must not
hamper performance of user queries.
Please share the expected backup size and backup
window in the expected system
Bidders to propose Backup solution(s) in view of the sizing given in Annexure
G
86 16 8 Storage should support data compression. It should
be possible to perform both fast compression and
efficient compression based on data processing
needs.
Compression is a CPU-taxing job. The majority of large-
scale storage systems are designed to provide a single
compression algorithm which is intelligent enough to
identify which data can be compressed efficiently and
fast.
Data which cannot be compressed (e.g. images or
videos) or which achieves a low compression ratio is
skipped so as to preserve CPU cycles and thereby
the performance of the overall system.
It is requested to drop this clause from the storage
specifications.
No change in standard clause of EOI
87 16 10 Ensuring real time health checks, monitoring and
alerting about data storage / utilization of storage /
failure handling of storage components. Actionable
dashboard must be available to designated users to
monitor health checks, and the tool should
automatically issue alerts to users.
Some errors in storage systems go unnoticed, without
being detected by the disk firmware or the host
operating system; these errors are known as silent
data corruption.
Please confirm if the storage should also have
features to protect from silent data corruption, and
end-to-end file integrity features.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
88 16 11 The storage system should be robust to handle at
least 1,50,000 concurrent queries (Select/DML) by
processing engines / ETL jobs / end users scalable
up to 6,00,000 concurrent queries in next 5 years
(assuming parallelism of 100 degree).
Storage system performance is governed by
IOPS/MBps and latency. We request the Bank to help in
understanding how many IOPS per transaction are
expected, and at what block size.
Bidder to provide details of performance benchmarking to enable us to take a
holistic and comprehensive view of the architecture in formulating next
course of action
89 16 12 Downstream departments (data consumer) to be
given separate processing power, storage to
undertake their requirements with separate DB
snapshot, Audit trails should be available for any
user accessing the Databases. Construction of this
separate Database snapshot and enabling this audit
trails must not cause any major systemic
issues/challenges in smooth functioning of primary
DB.
Please confirm if the downstream departments need
to have a dedicated storage environment, or can use
the same central storage using a dedicated volume
from the storage.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
90 19 Critical Functional
Requirements:Disaster Recovery
The DR solution should be synced with production
NEXT-GEN DW. The SLA for RTO should be
maximum 2Hrs as per Bank’s defined policy.
What is the expected RPO? Bidders to provide their best RTO/RPO for the solution(s) proposed.
91 20 1 Data older than specific duration as identified by
Bank to be archived in low cost cold storage.
Changing data archival rules should be easily
configurable. Vendor to propose solution for the
same with cheap and flexible storage and
processing
The cheapest storage available is tape. We assume
that the archival system needs to ensure that the data
archived to low-cost storage is still part of the same
filesystem. Please confirm.
Also, kindly mention the expected frequency of data
archival.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
92 20 5 Store backup of entire ecosystem on suitable cost-
effective, fast recovery infrastructure (Currently
tape backup is taken)
As the production data will be on disk- or flash-based
systems, we understand that tape-based backup is
still acceptable to the Bank as long as the tape library
offers high availability using dual robotics and
provides scheduled automatic media verification.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
93 20 6 Mixing and Merging data from Current/Live to and
from Archival must not result in any significant loss
of performance and response time
Any change required on archived data requires the
data to be moved back to the faster tier from lower
tier first. This shouldn't result in significant
performance or response time loss. However, we
would suggest a minimum throughput to be asked
from tape archival system for both archival and
retrieval.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
94 20 2 Cloud integration / data transfer to and from
public/private/hybrid cloud should be available
using all standard protocols. (Web requests /
secure transfer channels etc)
Please confirm if there is a need to archive data to a
public cloud storage from the production storage.
At present, as per the Bank's IS policy migrating/storing data in public cloud is
not permitted. However bidders may propose, as an alternative, use of cloud
(public cloud, private cloud, on-premise etc) in addition to the best integrated
proposed solution.
95 21 2 Provide trust – The system should be able to
ensure the users that they are accessing data from
the right source of information.
Please elaborate. The authenticated access
mechanisms and the user role-group matrix ensure this.
Is there any other trust level that needs to be generated?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
96 21 3 Provide auditability – the solution should record
any access to the data to satisfy compliance audits.
For example, it should be able to check on who
touched the data, when did they touch it, is there a
chain of custody issue, is there transparency in
terms of data privacy and protection etc.
For what duration would the user activity logs/queries
need to be stored?
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
97 21 4 Enforce security and privacy – Data inside the NEXT-
GEN DW will be accessed by only authorized users.
Data at rest/in-motion should be encrypted
What is the level of encryption to be achieved? (256,
2048)
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
98 21 5 Capability to classify and store (personal
identifiable information) sensitive data in
encrypted /masked form and should have capability
to decrypt/unmask such information in NEXT-GEN
DW when required by only authorized ID’s.
Will the bank provide the classification tables, or does
a workshop have to be conducted to arrive at the PII
data? The level of encryption is also required (256 or
2048, etc.).
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
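As an illustration of the style of PII handling this clause describes (purely a sketch; the key handling and field names are assumptions): display masking keeps only the trailing digits, while a keyed one-way token preserves joinability on masked data. Note that HMAC tokens are not reversible; the "decrypt/unmask when required" part of the clause would instead use encryption with keys released only to authorized IDs.

```python
import hmac, hashlib

SECRET_KEY = b"demo-key"  # in practice held in an HSM / key-management system

def mask_pan(pan: str) -> str:
    """Display masking: show only the last 4 digits to non-privileged users."""
    return "*" * (len(pan) - 4) + pan[-4:]

def tokenize(value: str) -> str:
    """Deterministic keyed token (HMAC-SHA256) so joins still work on masked data."""
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

record = {"name": "A Customer", "pan": "1234567890123456"}
masked = {"name": record["name"],
          "pan_display": mask_pan(record["pan"]),
          "pan_token": tokenize(record["pan"])}
assert masked["pan_display"] == "************3456"
assert masked["pan_token"] == tokenize("1234567890123456")  # deterministic
```

The deterministic token is what lets downstream analytics group and join on the sensitive field without ever exposing the clear value.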
99 22 14 Parallel processing: Data governance tool should be
able to handle 500 concurrent users, scalable up to
1000 users in next 5 years, running any kind of job
(eg: Data Lineage on simple/medium/complex jobs
running on multiple tables)
We understand that these concurrent jobs are queries
relating to information from the catalog and to data
lineage. Please confirm.
Yes
100 22 14 Parallel processing: Data governance tool should be
able to handle 500 concurrent users, scalable up to
1000 users in next 5 years, running any kind of job
(eg: Data Lineage on simple/medium/complex jobs
running on multiple tables)
Please clarify if the definition of concurrent users
means concurrently logged into the platform or
concurrently executing a job/ query. (The latter is a
fraction of the former)
Concurrently executing a job/ query.
101 22 13 Masterdata Management Capability: Master Data
Management tool (s) should deliver consolidated,
complete and accurate view of business-critical
master information to all the operational and
analytical systems across the Bank.
Please specify whether the Bank has implemented a
Master Data Management solution earlier across SBI.
If yes, please provide details of the same.
At present, DWH doesn’t have MDM solution
102 23 8 Security features: - to help in protecting the
information contained in the data dictionary.
Kindly elaborate on the security features.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
103 24 Security and Compliance - 1 Authentication and Identity Management - A
comprehensive identity and access management
system should be available for centralized
management of users and groups. It should be
possible to quickly create and revoke the identity of
a user or a service by simply deleting or disabling
the account in the directory. Multi-factor
authentication is desired as an additional layer of
security for user sign-in and transactions.
There are requirements on Identity and Access
Management as well as on user access management
administration.
Do we have to propose and provide an Identity and
Access Management (IAM) solution?
We need clarification on whether we should integrate
with an existing IAM, or whether an IAM solution needs
to be implemented for the NEXT-GEN DW.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
104 24 1 Authentication and Identity Management - A
comprehensive identity and access management
system should be available for centralized
management of users and groups. It should be
possible to quickly create and revoke the identity of
a user or a service by simply deleting or disabling
the account in the directory. Multi-factor
authentication is desired as an additional layer of
security for user sign-in and transactions.
Would the SI be required to use the Bank's
Authentication and Identity Management systems, or
integrate with the Bank's PIM etc.?
Please provide any suitable details regarding the IAM
modules of the Bank.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
105 25 Security and Compliance - 6 Data Leakage - Security CIA parameters should be
achieved, and tools should be able to find and alert
on Data leakage
Are there existing Data Leakage (Loss) prevention
tools which can be leveraged?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
106 25 Security and Compliance - 8 Compliance to Global Standards Among the standards provided, can we get the list of
standards we can consider while scoping?
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
107 25 8 Compliance to Global Standards , GDPR, BCBS239,
PCIDSS, DFRA and similar relevant standards
Please specify whether the Bank has implemented
GDPR, BCBS239, PCIDSS, DFRA solutions in part. Please
provide further details on these requirements, as this is
a very broad and comprehensive area of solutioning.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
108 25 5 Data Democratization - Secure access of PROD
Database to LHOs/GOCs
Please elaborate on the levels of access to the LHO
and GOCs.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
109 25 6 Data Leakage - Security CIA parameters should be
achieved, and tools should be able to find and alert
on Data leakage
Does the Bank have a DLP in place, or is the SI
expected to implement a DLP in parallel?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
110 25 8 Compliance to Global Standards
GDPR, BCBS239, PCIDSS, DFRA and similar relevant
standards
Office of Foreign Assets Control (OFAC)
Financial Crimes Enforcement Network (FinCEN)
Securities and Exchange Commission (SEC)
Office of the Comptroller of the Currency (OCC)
etc.
We request filtering the standards down to those
relevant as local Indian standards. The international
branch locations need to be classified, and then the
respective regulations local to each nation applied.
Does the Bank expect another classification in the
Data Lake based on the country of origin and the
respective regulation?
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
111 26 1 Logging of operational activities: Support the
logging of all user activities without slowing down
the performance.
Log retention periods are required. Based on the
log estimates, the storage requirements are to be
calculated. Similarly, the retrieval speed of the
logs/audit trail will set the physical media attributes,
like flash etc.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
112 26 2 Should have metadata enabled reporting
mechanism on run time log.
Does the Bank expect a separate analytical model to
be built on user logs for the audit trail?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
113 26 4 Extracting useful information from system logs to
understand the efficiency of system and any fraud
Extraction time and query parameters need to be
arrived at. Does the bank have any such parameters
for fraud which it can pass on to the SI, so that the
queries can be built?
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
114 26 6 Vendor to propose solution with cheap storage
options for log storage of all the Bank’s applications
and mechanisms to extract requested information
from the logs as and when required
Storage cost would depend on the speed of the query
results. Bank to provide the parameters.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
115 28 Data Science Platform with AI/ML
Capabilities (Sl No.2)
Power data / objects to existing analytics models
built on proprietary tools (IBM SPSS). Migration of
such models to new solution
How many such models in IBM SPSS are we expected
to migrate? Also, do you continue to use the same
models, or rebuild them on the new platform if required?
This information is not required at this stage of EOI.
116 28 Data Science Platform with AI/ML
Capabilities (Sl No.1)
Implementing end to end analytics use-cases as
mandated by the Bank
How are these analytics use-cases getting
deployed/consumed/used (especially in real time)?
Through online platforms/mobile apps/CBS/YONO etc.?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
117 28 Data Science Platform with AI/ML
Capabilities (Sl No.3)
Availability Pre-build models which can be directly
used with Bank’s data to get insights
What kind of pre-built models are we
expecting?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
118 29 Data Science Platform with AI/ML
Capabilities (Sl No.15)
Build and Publish detailed reports, insights on web
portal
What kind of reports are we talking about: model
performance or output-based reporting?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
119 29 Data Science Platform with AI/ML
Capabilities (Sl No.19)
Integration with R, Python, Keras,
Tensorflow,Theano, scikit-learn etc and other
frameworks / languages
Do we have any models deployed on open-source
software such as R/Python etc.?
If yes, how many such models need to be
migrated? Also, please provide counts by software
used (e.g. how many in R/Python etc.).
If no, does the bank have suitable policies around
open-source analytics software usage? Which of the
open-source software packages are approved?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
120 29 Data Science Platform with AI/ML
Capabilities (Sl No.21)
Support on demand basis Please clarify what "Support on demand basis" implies.
Would this be AMS alone, or shall support also
include training?
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
121 29 Data Science Platform with AI/ML
Capabilities (Sl No. 13)
Access management at the data, workflow and
models
Please clarify how many users are expected to use
the platform.
Please refer Hardware Specification subsection, point number #34 on page
#35
122 30 6 Reporting on all types of available of Data Formats; Please share additional details on the nature of
reporting required on unstructured data, click stream
data, images, videos, audios etc.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
123 32 Critical Functional Requirements:
Hardware Specifications
The proposed solution envisages use of commodity
hardware, and if any proprietary components are
used, should be listed in the response with details
and justification.
We need clarity on whether SBI is looking for any specific
platform here, like Power or Intel, or whether it is up to the bidder to decide.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
124 32 3 The proposed solution envisages use of commodity
hardware, and if any proprietary components are
used, should be listed in the response with details
and justification
While this point envisages the use of commodity
hardware, the next point (i.e. point 4) states that
vendor-proposed hardware is expected to be
enterprise class and best of breed. Kindly provide
more clarity on this ask. Typically, we have seen banks
go for commodity hardware for the data lake, but best-
of-breed, enterprise-class, non-commodity hardware
for the Data Warehouse.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
125 33 Critical Functional Requirements:
Hardware Specifications
The Vendor is required to supply, install, test,
commission, monitor, manage and maintain the IT
System along with operating system and other
peripherals with one-year warranty and AMC for 4
years from the date of delivery at data centers
advised by the Bank
We need clarity on whether BAU support (day-2 operations
support) needs to be considered, and whether the existing
network infrastructure is to be used, with the bidder needing
to provide only ToR switches for the NEXT-GEN DW systems.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
126 33 4 Vendor proposed hardware is expected to be
enterprise class, best of breed, tested and stable
release
Enterprise-class hardware is not necessarily built using
pure-play commodity components. For better
reliability of the hardware, it is requested to ask for fit-
for-purpose hardware which will help the bank achieve
the desired business objectives.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
127 33 5 The proposed architecture considers vertical and
horizontal scalability as one of the most important
design principles.
Not all hardware required to run the solution will give
both horizontal and vertical scalability. We request
you to enforce the clause on a "wherever technically
possible" basis.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
128 33 9 The hardware will be delivered in a staggered
manner and Vendor to provide a plan for the same
While it is possible to support the hardware for a
period of 7 years from the date of supply, it might not
always be possible for the same hardware to be
available for sale for a duration of 7 years for
staggered deliveries. Hence it is requested to ask for
backward and forward compatibility of the hardware,
such as in the storage subsystem, where both vertical
and horizontal scalability is a must.
No change in standard clause of EOI
129 34 18 The proposed hardware is mission critical for the
proposed project and support of 24 X 7 with an
uptime of 99.99 % to be ensured by providing
support at PR, and DR site for a period of 5 years.
It is requested to ask for a 6-hour "call to resolution"
time in the event of hardware failure, for better uptime.
Also, please ask vendors how they plan to
achieve this requirement.
No change in standard clause of EOI
130 35 Critical Functional Requirements:
Hardware Specifications
The Hardware solution must be compatible to
integrate with various systems in the Bank
including but not limited to SOC, PIMS, NOC,
Command Centre, ITAM, Service Desk, ADS, and
SSO etc. at no extra cost. Vendor will have to give
appropriate support to the Bank during integration
with various components of IT environment.
We need clarity on whether SBI will provide the service
desk/ticketing tool and other tools that require licences.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
131 36 35 The web portal of Business Intelligence tool should
support at-least 25000 concurrent users, scalable
up to 75000 in next 5 years, accessing various
reports generated
Please share additional details on this user base such
as user categories, size of this user population in terms
of number of users and their usage profile. Please also
share additional information on how the estimate of
25000/75000 concurrent users has been arrived at. This will
help us understand and size for the requirement.
The usage profile could cover information such as the
number of report/dashboard refreshes or queries each
such user is expected to fire during a working day or if
in a particular part of the working day, there is any
window when higher concurrency is expected for
some specific information, etc. (such as all branch
managers refreshing a dashboard between 9-10 am on
Mondays)
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
132 36 35 The web portal of Business Intelligence tool should
support at-least 25000 concurrent users, scalable
up to 75000 in next 5 years, accessing various
reports generated
Please clarify if the definition of concurrent means
concurrently logged into the BI platform or
concurrently executing a report/dashboard refresh or
ad hoc query. (The latter is a fraction of the former)
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
133 29, 40-43 Data Science Platform with AI/ML
Capabilities (Sl No.24)
Analytics on real-time data in real-time/near real-
time
Which of the Appendix F use cases are currently
deployed, or expected to be deployed, in real time/near real time?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
134 11 Eligibility Criteria Vendor should have existing Next-Gen Data
Warehouse solutions as mentioned in EOI
Please advise whether a vendor will qualify only if they
have delivered all solutions per Annexure B at a single
client, or whether solutions delivered across multiple
clients in smaller pieces would be considered.
Please refer to Corrigendum.
135 11 Eligibility Criteria Solutions should have been implemented in at least
2 large scale organisation
Please define the size of a large-scale organisation.
Should it be an Indian organisation, or would an entity
outside India also be acceptable?
Please refer to Corrigendum.
136 General query Does the current scope of work cover only domestic
operations, or international operations as well?
Current scope of work covers both Domestic and International operations of
the State Bank of India (SBI) Group including subsidiaries.
137 3 Schedule of event Last date of submission - 22 April 2019 We request a 3-week extension of the submission
timeline.
No change in timelines of EOI
138 General query Can you provide an IT org chart? This information is not required at this stage of EOI.
139 General query Is there a current IT strategy and if yes, can you please
share with us?
This information is not required at this stage of EOI.
140 General query Is there a current IT roadmap/plan and if yes, can you
please share with us?
IT Roadmap of DWH department is already covered in EOI. Hence separate IT
Roadmap is not required.
141 General query Can you share any documentation around High-level
system architecture depicting the current IT
landscape? 32 use cases have been specified in
annexure E. Would it be possible to prioritize these
use cases to allow for an efficient and orderly
evolution of the Next Gen DW?
This information is not required at this stage of EOI.
142 General query Can you please specifically indicate any in-
progress/planned projects which may conflict
with this effort from a functionality standpoint? What
engagement timeline do you envisage based on your
internal IT roadmap strategy, taking into
consideration any big upgrade or software
development conflict you see in the near-term future?
This information is not required at this stage of EOI.
143 General query Can you provide the existing IT security policies? Annexure F gives the Summary of SBI Internal IS Policies.
Detailed information is not required at this stage of EOI.
144 General query Can you please provide a list of all the IT systems,
providing brief description of them?
This information is not required at this stage of EOI.
145 General query Is your landscape currently leveraging any cloud
services? If so, on what infrastructure?
This information is not required at this stage of EOI.
146 General query From the current environment, are there areas that
are working well and proposed to be retained for the
next gen DW with minimal changes?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
147 26-27 Data Encryption & Masking General query Can you please share the encryption standards used for
authorizing users?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
148 General query Can you kindly share a list of downstream
applications, if the jobs you have mentioned include
downstream data processing jobs as well?
This information is not required at this stage of EOI.
149 General query The EOI mentions that existing infrastructure
components/ETL jobs/schedulers etc. will need to be
reused. Can the bank provide additional details on the
existing architecture/system components (e.g.,
different platforms, programming languages used,
etc.)?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
150 13-15 Data Ingestion General query Can you kindly share the current system architecture,
comprising the number of sources, including ERPs,
stand-alone bespoke applications, web applications,
file systems, etc.?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
151 18-19 Migration from Existing Setup to
Proposed Solution
General query Can you kindly share the list of software and technology
stack being used across the IT landscape, specifically
around ETL and BI reporting?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
152 18-19 Migration from Existing Setup to
Proposed Solution
General query Please advise any ETL tool that you are currently using
for data integration.
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
153 18-19 Migration from Existing Setup to
Proposed Solution
General query Can you advise any reporting platform or tools
currently used for operational, management or
financial reporting?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
154 General query From the current environment , are there areas that
are working well and proposed to be retained for the
next gen DW with minimal changes?
Duplicate Query. Refer to Sr.No. #146
155 General query How many existing applications will need to be
migrated to the new infrastructure? Does the bank
have any preference on the migration approach (e.g.,
big bang vs staggered)?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
156 13-15 Data Ingestion General query Performance benchmarking for the Next Gen DW – are
there any bank-specified/minimum standards (e.g.,
latency/throughput requirements)?
Bidder to provide details of performance benchmarking to enable us to take a
holistic and comprehensive view of the architecture in formulating next
course of action
157 23-24 Data Quality General query The EOI includes data quality as one of the key
requirements. Would the same data quality standards
need to be applied across historical vs operational
data (in most banks, historical data standards are
typically lower). Also, given the magnitude of the data
cleansing effort, would certain use cases/data domains
need to be prioritized (e.g., data required for RBI
reporting)?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
158 13-15 Data Ingestion General query What is your current performance benchmark for
throughput while sourcing data?
This information is not required at this stage of EOI.
159 40 Functional Use Cases General query 32 use cases have been specified in Annexure E. Would
it be possible to prioritize these use cases to allow for
an efficient and orderly evolution of the Next Gen
DW?
This information is not required at this stage of EOI.
160 13-15 Data Ingestion General query What is the current process for fixing rejections, data
quality issues and data anomalies?
This information is not required at this stage of EOI.
161 19-20 Disaster Recovery/Data Archival and
Backup
General query Would you kindly share the current SLA for RTO? This information is not required at this stage of EOI.
162 19-20 Disaster Recovery/Data Archival and
Backup
General query Can you kindly share the current version control
tool/software being used across systems and the
protocol followed for RTO?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
163 19-20 Disaster Recovery/Data Archival and
Backup
General query Can you kindly share the data retention period for the
systems mentioned in Annexure G?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
164 23-24 Data Quality General query Can you share the existing data quality tool and the
specific features used on a daily basis?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
165 23-24 Data Quality General query What is the current process for fixing data quality
issues?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
166 23-24 Data Quality General query Can you please provide details around the profiling being
used, i.e. is profiling done across all data or
master/dimensional data only? What is the number of
entities that profiling is based on as of today?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
167 23-24 Data Quality General query What is the SLA for data correction, and what is the
process as it exists today?
This information is not required at this stage of EOI.
168 24 Data Reconciliation General query Is data reconciliation to be managed at report level or
at database level? And consequently, would
there be a workflow process required?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
169 26-27 Data Encryption & Masking General query Are there any BFSI rule-sets readily available to which the
client is currently aligned in the context of PII?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
170 26-27 Data Encryption & Masking General query Are there any such rule-sets expecting updates which
will impact the effort under consideration?
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
171 30 Business Intelligence Tools General query Can you provide a list of the current KPIs being
monitored? Can you provide a break-up of
dashboards/reports by functional area, e.g. regulatory
reporting, customer acquisition, customer retention, etc.?
Please refer to requirements given in the EOI.
172 30 Business Intelligence Tools General query Can you please share the total number of reports, number
of dashboards, and number of KPIs tracked and reported
as part of regulatory reporting?
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
173 30 Business Intelligence Tools General query What are the other reporting solution needs for the
client, and the expected functionality thereof?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
174 30 Business Intelligence Tools General query What is the scope/tenacity of analytics planned to be
used, based on the use cases shared?
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
175 General query Can you share any or all software licenses currently
used for the existing Data ware house solution
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
176 General query Can you share any or all software licenses currently
used from upstream and downstream standpoint and
your preference thereof to related product suite
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
177 General query How open are you to providing VPN connectivity for
people logging in from outside your office who are
working on the development effort?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
178 18-19 Migration from Existing Setup to
Proposed Solution
General query Can you specify how many
master/transaction/dimension/fact/aggregate tables you
have in your ODS/DWH?
Please refer to Annexure D.
Other details are not required at this stage.
179 30 Business Intelligence Tools General query How many dashboards do you currently use, and what is
the total number of reports related to those dashboards?
This information is not required at this stage of EOI.
180 17 Data Processing Framework General query What is the key ask for AI/ML technologies other than
performance? Can you please elaborate upon them?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
181 23-24 Data Quality General query What are the key issues faced as part of data
governance which you would like to address on
priority?
Please refer to Data Governance section of EOI on page #21
182 Pg 15 3 A multi-temperature data management solution to
be proposed by vendor, where data that is
frequently accessed resides on fast storage (hot
data), less-frequently accessed data is stored on
slightly slower storage (warm data), and
rarely accessed data is stored on the slowest storage
(cold data). The system should also be capable of
automated storage tiering and seamless data
transfer between hot, warm and cold storage. Data
residing in any of these storage areas must be
seamlessly mixed / merged according to
requirements without impacting performance.
1. What are the criteria based on which movement of
data needs to be defined?
2. The duration of data residing in each tier (hot/warm
and cold) should be defined.
3. What is the amount of daily data (new data)
coming into storage?
4. The expected response time from the storage system
should be defined.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
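Since the EOI leaves tiering criteria to the bidder, the clause's hot/warm/cold movement can be sketched as a simple age-based policy. The thresholds and dataset names below are illustrative assumptions, not values from the EOI.

```python
from datetime import datetime, timedelta

# Illustrative thresholds (assumptions, not EOI-mandated values):
# accessed within 30 days -> hot, within 365 days -> warm, otherwise cold.
HOT_DAYS = 30
WARM_DAYS = 365

def assign_tier(last_accessed: datetime, now: datetime) -> str:
    """Return the storage tier for a dataset based on its last access time."""
    age = now - last_accessed
    if age <= timedelta(days=HOT_DAYS):
        return "hot"
    if age <= timedelta(days=WARM_DAYS):
        return "warm"
    return "cold"

now = datetime(2019, 4, 1)
tiers = {name: assign_tier(ts, now) for name, ts in [
    ("txn_current", datetime(2019, 3, 28)),   # recent daily transactions
    ("txn_last_fy", datetime(2018, 9, 1)),    # last financial year
    ("txn_archive", datetime(2014, 1, 1)),    # rarely touched history
]}
```

A production platform would derive `last_accessed` from query-log statistics and move data automatically, which is exactly the automated tiering the clause asks for.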
183 PG 16 4 Storage replication (e.g. RAID) should be
automatically managed by the platform.
Synchronous or asynchronous replication should be
defined, and RAID technology should be removed.
No change in standard clause of EOI
184 Pg 16 5 Tool should have capability to store/swap data in
memory, disk and distributed storage areas
depending on the age of the data determined by its
usages through user queries.
Memory should be removed. Swapping can happen at
storage/ Disk level.
No change in standard clause of EOI
185 Pg 16 11 The storage system should be robust to handle at
least 1,50,000 concurrent queries (Select/DML) by
processing engines / ETL jobs / end users scalable
up to 6,00,000 concurrent queries in next 5 years
(assuming parallelism of 100 degree).
Storage performance metrics (IOPS, throughput and
response time) should be defined.
Bidder to provide details of performance benchmarking to enable us to take a
holistic and comprehensive view of the architecture in formulating next
course of action
186 Pg 19 4 The DR solution should be synced with production
NEXT-GEN DW. The SLA for RTO should be
maximum 2Hrs as per Bank’s defined policy.
SLA for RPO needs to be defined along with RTO. Bidders to provide their best RTO/RPO for solution(s) proposed.
187 General query Is Deloitte expected to procure all hardware, tools,
links etc.?
This information is not required at this stage of EOI.
188 General query Will the solution be deployed at SBI DC and DR? Yes, the solution will be deployed at SBI DC and DR.
189 13 2 Data Ingestion
Data may be structured, semi-structured, and
unstructured. It may come from internal or external
sources. It may come in batches, incremental
additions or real-time feeds. There should be no
limitation on the type, format and size of data
ingested. Data may include log, feeds, audio, video,
image, NOSQL, RDBMS, unstructured text, through
ERP systems, etc
Is the NOSQL data mentioned here JSON/XML? What
kind of processing on NOSQL is expected? Is there a
need to join the NOSQL data with other relational data?
Is there a need to shred the NOSQL data into
relational data?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
190 14 11 Data Ingestion
One of the most important feature is the richness
of the transformations to do day-to-day tasks, such
as;
Data conversion, lookup, expression, joining
records, splitting data, filtering, ranking, sorting,
grouping, looping, and combining data,
pivot/unpivot, converting dates, setting variables
based on parameter files, merging rows, finding the
latest file, and splitting data based on certain
conditions, running web methods, transforming
XML documents, rebuilding indexes, sending
emails, profiling data, handling arrays and records,
processing unstructured data, masking, monitoring
the inbound data flow for completeness,
consistency and accuracy, wizards to assist creating
complex packages, like loading fact tables, or type
two slowly changing dimensions (SCD – T2)
Are any tools / licenses already available with SBI for
ingesting data used in the existing DW solution?
Please provide a list. Any existing tools that align with
the new solution can be considered for reuse if found
to be a good fit.
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
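Among the transformations listed in the clause, Type-2 slowly changing dimensions (SCD - T2) are the most involved. A minimal sketch of the expire-and-insert logic, with hypothetical field names and a plain in-memory dimension table:

```python
from datetime import date

def apply_scd2(dimension, incoming, today):
    """SCD Type-2: close the current row for a changed key, then append
    a new version row; unchanged records are left alone."""
    for row in dimension:
        if row["key"] == incoming["key"] and row["end_date"] is None:
            if row["attrs"] == incoming["attrs"]:
                return dimension          # no change: keep current version
            row["end_date"] = today       # expire the old version
    dimension.append({"key": incoming["key"], "attrs": incoming["attrs"],
                      "start_date": today, "end_date": None})
    return dimension

dim = [{"key": "C001", "attrs": {"branch": "Mumbai"},
        "start_date": date(2018, 1, 1), "end_date": None}]
apply_scd2(dim, {"key": "C001", "attrs": {"branch": "Pune"}}, date(2019, 4, 1))
```

In a real ETL tool the same logic is usually expressed as a wizard-generated merge, as the clause's mention of "wizards to assist creating complex packages" suggests.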
191 15 1 Data Storage
Vendor should propose effective number of data
storage layers in NEXT-GEN DW between data
ingestion and data consumption.
Are surrogate keys used in the existing DW?
If yes, is there a framework for storage and
management of the keys to ensure robustness of the
data warehouse?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
192 16 11 Data Storage
The storage system should be robust to handle at
least 1,50,000 concurrent queries (Select/DML) by
processing engines / ETL jobs / end users scalable
up to 6,00,000 concurrent queries in next 5 years
(assuming parallelism of 100 degree).
Will all these queries be executing in-flight at the
same time? Or will these be initiated over a period of
time, with the aggregate number of queries run during
that time period being 600,000?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
193 16 13 It should be possible to project and view data
through multiple modes using the Storage on NEXT-
GEN DW. Varieties of GUIs should be available to
project or view the output generated through
analytic processes. For instance: The Bank may
decide to implement use-cases that project
transactions data as a graph data structure. The
Storage solution on NEXT-GEN DW should allow for
such projections.
Do we need to enable graph-based analytics as well, or
should it be limited to providing the facility to access
the data for graph-based analytics?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
194 17 1 For the data to be accessible and consumable by
businesses / downstream applications, the NEXT-
GEN DW should have robust, highly efficient and
parallel execution of data transformation jobs.
We can enable JDBC/ODBC or REST API based access.
Will there be any specific mechanism to connect, such as
specific types of drivers needed for connectivity with the
SAS system, etc.?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
195 17 2 Data Processing Framework
The NEXT-GEN DW ecosystem should have state of
the art data processing engines that can perform in-
memory processing to reduce the time for data
transformations and query in case of real time
requirements.
What are the expected latency requirements for real
time processing? The solution may vary based on the
use case, i.e. whether the requirement is for immediate
processing vs a latency of up to 5-10 minutes.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
196 17 3 Framework should allow joining multiple
sources/tables/inputs etc.
Does 'source' here mean the data ingested from the
multiple source systems and present on the NEXT-GEN
DW platform, not the data at the source systems?
Yes
197 17 5 Framework should be capable of performing
validation checks pre-and post- processing.
What will be the outcome of validation? This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
198 17 8 Data Processing Framework
Should have audit and error logs for auditing and
troubleshooting
Is there a requirement to maintain row-level
traceability of the data records, i.e. from the data
consumption layer backwards up to the originating
source file for a particular record?
Yes
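Since SBI confirms row-level traceability is required, one common pattern is to stamp every ingested record with lineage columns at load time; the column names and helper below are illustrative assumptions, not part of the EOI.

```python
import uuid
from datetime import datetime, timezone

def tag_with_lineage(record: dict, source_file: str, load_job: str) -> dict:
    """Attach lineage columns so a record can be traced from the
    consumption layer back to its originating source file and load job."""
    tagged = dict(record)
    tagged["_row_id"] = str(uuid.uuid4())        # unique id per ingested row
    tagged["_source_file"] = source_file         # originating file/feed
    tagged["_load_job"] = load_job               # job that ingested the row
    tagged["_load_ts"] = datetime.now(timezone.utc).isoformat()
    return tagged

row = tag_with_lineage({"acct": "123", "bal": 500.0},
                       "gl_feed_20190401.csv", "daily_gl_load")
```

Carrying these columns through every downstream transformation is what makes audit and error logs usable for troubleshooting at the record level.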
199 17 9 Automatic recovery of data after failure/rejection
of record needs to happen without any manual
intervention
Is there any specific treatment that needs to be
performed for rejected records?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
200 17 11 Framework should have mechanism to protect data
at rest and at motion from unauthorized user
access and amendments.
Is it required to encrypt data at rest and in motion? Is
there a data masking tool available with the bank? Is
there a need for data masking?
Yes, there is a need of encryption and data masking for data at rest & in
motion. Bidder is free to propose suitable tools/solution for this.
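The masking requirement confirmed above can take many forms; a minimal sketch of two common techniques, redaction and deterministic tokenization, is shown below. The field format and salt handling are assumptions for illustration only, not SBI-specified.

```python
import hashlib

def mask_account(acct: str) -> str:
    """Redact all but the last 4 characters of an account number."""
    return "X" * (len(acct) - 4) + acct[-4:]

def tokenize_pii(value: str, salt: str) -> str:
    """Deterministic salted hash: the same input always yields the same
    token, so masked values can still be joined across tables without
    exposing the underlying PII."""
    return hashlib.sha256((salt + value).encode()).hexdigest()

masked = mask_account("12345678901")
tok_a = tokenize_pii("12345678901", "s1")
tok_b = tokenize_pii("12345678901", "s1")
```

Redaction suits display-layer masking; tokenization suits analytical environments where referential integrity must survive the masking step.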
201 17 14 Data Processing Framework
Data transformations should be triggered in
parallel. The NEXT-GEN DW should be capable to
run multiple transformation jobs in parallel. The
NEXT-GEN DW should be able to run at-least 1500
jobs in parallel, scalable up to 5000 in next 5 years,
of varying complexity - simple, medium, complex, in
batch or near real time mode every day.
Is the bulk of the data transformation jobs expected to
be triggered during non-business hours, when user
reporting and other workloads are at a minimum? Are
there going to be users across multiple timezones, or
will a large part of the user base be within a single
timezone?
This information is not required at this stage of EOI.
202 17 17 The processing pipelines for ETL/ELT jobs also
include real time, daily, weekly, monthly, quarterly
and annual reports, feeding data structures for
downstream consumption. These activities are in-
scope for this engagement.
How many such reports are there, and what is the data
model for them? This is to estimate the number of ETL
jobs required for the end system.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Hence Bidders
are free to propose suitable solution to meet the requirements of this EOI.
203 17 19 The workflows should work with standard
schedulers. Monitoring and management of
workflows should be possible from an easy to use
interface. Workflow management tool(s) should
have connectors / pluggable interfaces to already
existing / in-use proprietary software available with
the Bank. These could be (and not restricted to)
data repositories, reporting tools, data analysis
tools and generic interfaces for data transfer.
Scheduled jobs status should be made available to
the Bank in Monitoring dashboard on real time
basis.
What are the supported mechanisms with the proprietary
tool? Does that tool support REST-based integration?
Which scheduling and monitoring tools does the bank
have?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
204 18 1 Migration from Existing Setup to Proposed Solution
Vendor should propose a detailed seamless
automated migration plan from existing setup to
proposed solution. Plan should focus on less
manual intervention, data reconciliation between
the systems and minimum parallel run of existing
and proposed solution.
Integration of real time data with the data on the
NEXT GEN DW is possible, but does this requirement
mean integrating with multiple source systems? Do
we get access to the source systems directly?
This information is not required at this stage of EOI.
205 18 1 Migration from Existing Setup to Proposed Solution
Vendor should propose a detailed seamless
automated migration plan from existing setup to
proposed solution. Plan should focus on less
manual intervention, data reconciliation between
the systems and minimum parallel run of existing
and proposed solution.
What is the defined reconciliation mechanism? Is it
point-in-time based? All the systems will have
different data based on the execution time and
frequency of the ETL jobs.
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
206 18 2 Migration from Existing Setup to Proposed Solution
Data migration from Staging and Data Marts, user
tables and any other schemas identified by Bank.
Along with the Staging and Data Mart objects, is there
any integrated data layer in the existing solution? If
yes, then is it built using any proprietary data model?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
207 29 7 Data Science Platform with AI/ML Capabilities
In-memory computing & integration with Spark,
Redis, etc
What kind of analytic processing is expected on Spark,
Redis, etc.? Is this using Spark-ML, for example?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
208 29 18 Data Science Platform with AI/ML Capabilities
All machine-learning platforms either support
multiple models out of the box or provide an
option to custom-code the same
What kind of use cases for ML are expected, so as to
understand the need for out-of-the-box vs custom
solutions?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
209 29 19 Data Science Platform with AI/ML Capabilities
Integration with R, Python, Keras,
Tensorflow,Theano, scikit-learn etc and other
frameworks / languages
Is there a need to connect with any other analytic
engines to be run in a parallelized/distributed
manner on the system?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
210 29 24 Data Science Platform with AI/ML Capabilities
Annexure E gives sample use cases which are to be
implemented on Next Gen Data Warehouse using
structured and/or unstructured and/or semi-
structured and/or any other kind of data gathered
from either Data Warehouse or Data Lake or Data
Virtualization or all together or any other source.
Is there a need to read data directly from a low cost
storage system and do complex analysis/queries
involving multi-table joins with curated relational data,
using SQL/analytic functions, which require
performance and scalability?
Yes
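SBI's "Yes" above implies SQL-style joins between data landed from low-cost storage and curated relational data. A toy illustration of that access pattern, using SQLite purely as a stand-in engine and hypothetical table names:

```python
import sqlite3

# In-memory stand-in: one "curated" relational table and one table loaded
# from a low-cost storage extract (e.g. a CSV landed in the data lake).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customer (cust_id TEXT, segment TEXT)")
con.execute("CREATE TABLE txn_raw (cust_id TEXT, amount REAL)")
con.executemany("INSERT INTO customer VALUES (?, ?)",
                [("C1", "retail"), ("C2", "corporate")])
con.executemany("INSERT INTO txn_raw VALUES (?, ?)",
                [("C1", 100.0), ("C1", 250.0), ("C2", 75.0)])

# Multi-table join with an aggregate over the combined data.
rows = con.execute("""
    SELECT c.segment, SUM(t.amount) AS total
    FROM txn_raw t JOIN customer c ON c.cust_id = t.cust_id
    GROUP BY c.segment ORDER BY c.segment
""").fetchall()
```

At scale, the same pattern would be served by an engine that pushes SQL down to the lake storage rather than copying data into a local database.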
211 35 33 Hardware Specifications
Next-Gen DW should support at-least 500
concurrent users, scalable up to 1000 users in next
5 years, running ETL/ELT jobs or doing ad-hoc data
extraction requests on database (Not including API
based access or scheduled job connections to
database)
What is the nature and expected concurrency of API
based access? What is the nature and concurrency of
scheduled job connections? Are these ETL, or
maintenance-related connections?
Will the 1000 concurrent users be expected to be
running queries simultaneously, or is this just 1000
concurrent logons?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
212 36 37 Hardware Specifications
Ad-hoc jobs of any complexity should not hamper
the scheduled jobs performance.
What is the expected mix of queries in terms of tactical
(very short), medium, long running (reports), batch
loads, near real-time and real-time loads running
simultaneously on the system?
This information is not required at this stage of EOI.
213 18
Migration from Existing Setup to
Proposed Solution
Vendor should propose a detailed seamless
automated migration plan from existing setup to
proposed solution. Plan should focus on less
manual intervention, data reconciliation between
the systems and minimum parallel run of existing
and proposed solution.
1. How frequent are changes to the code ?
2. What is the defined process to capture the change
requests?
3. What version control tool is currently being used ?
4. What is their current release management process?
5. What is the definition of Minimum Parallel Run?
6. Can the work be done from Teradata offshore
locations, or does it have to be done onsite?
7. What is the current Testing Strategy? Is there any
document?
8. Are there any performance-related expectations?
9. What are the inventory numbers of the objects (if
possible, by complexity) of the existing DWH which
need to be migrated?
10. Can the legacy code be shared for pattern analysis?
(If not the complete code base, can a sample be
shared?)
11. What kind of Test Automation Tools are used
currently?
12. What is the typical availability of customer SMEs
for UAT?
13. Does the Customer have the business
reconciliation queries? In other words, how do they
verify the data loaded in the current environment?
14. Does the customer have an existing Test/QA
environment ?
15. What kind of documentation is required as a part
of deliverables?
1. This information is not required at this stage
2. Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
3. Currently we are using IBM Stack. Bidders are free to propose solution(s)
in the best interest of the Bank to meet the requirements given in the EOI.
4. This information is not required at this stage
5. Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
6. Onsite only
7. This information is not required at this stage
8. Bidder to provide details of performance benchmarking to enable us to
take a holistic and comprehensive view of the architecture in formulating
next course of action
9. This information is not required at this stage of EOI.
10. No
11. This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
12. This information is not required at this stage of EOI.
13. Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
14. Yes
15. Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
214 18
Migration from Existing Setup to
Proposed Solution
Data migration from Staging and Data Marts, user
tables and any other schemas identified by Bank.
1. How many more data marts are there other than 4?
2. Are these data marts built on different databases?
3. Does their DWH comprise multiple data marts
only, or do they have an integrated EDW and model
already in place?
4. Is there any Subject Area Priority (logical split of the
Next Gen DW) & anticipated sizing?
5. What are the SLA times of the current ETL jobs?
6. Are the large tables horizontally partitioned?
1. This information is not required at this stage of EOI.
2. No
3. This information is not required at this stage of EOI.
4. Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
5. This information is not required at this stage of EOI.
6. This information is not required at this stage of EOI.
215 18
Migration from Existing Setup to
Proposed Solution
Data migration from existing archival solution to
new one.
1. What is the existing Archival Solution? Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
216 18
Migration from Existing Setup to
Proposed Solution
Migration of existing data sourcing ETL jobs. 1. Do they maintain/update a Data Mapping Sheet for
ETL/Data Ingestion?
2. Is the data currently loaded in Batch or Mini-Batch?
3. Do they have any Design and Coding Standards?
4. Do they have a document on the current ETL
Architecture/Solution, code patterns and their
complexity?
5. Which Scheduler is being used?
6. What is the current data volume, and how much
data is ingested through the different ingestion
mechanisms (batch, real-time, etc.)?
7. Do they want to change any current (ETL) tools
they have? If yes, then what would be the tool stack?
8. Do they currently have an ETL Control Framework
implemented?
1. This information is not required at this stage of EOI.
2. Yes
3. This information is not required at this stage of EOI.
4. This information is not required at this stage of EOI.
5. Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
6. Please refer Annexure C & D
7. Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
8. Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
217 19
Migration from Existing Setup to
Proposed Solution
Migration of monitoring dashboard data points 1. Is it to show the progress of migration underway?
2. Is there a need to build a migration framework for
future data migration?
1. Yes, migration progress can be shown on the monitoring dashboard; in addition,
the Bidder is expected to migrate the existing monitoring dashboard to the new
setup
2. Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
218 19
Migration from Existing Setup to
Proposed Solution
Migration of Data Governance, Data Lineage and
Data Quality rules and policies
1. What are the existing Data Governance, Data
Lineage and Data Quality rules and policies?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
219 19
Migration from Existing Setup to
Proposed Solution
Migration of All the remaining components of
existing ecosystem (Mentioned in Annexure - D) as
and when identified by Bank like job scheduler,
reports, history of version control, existing tape
backup, etc.
1. Will access be provided to all systems? What
would be the constraints?
This information is not required at this stage of EOI.
220 19
Migration from Existing Setup to
Proposed Solution
Vendor should list out all types of risks they expect
during the migration. Vendor should provide
justification if any downtime is required on existing
or proposed system during migration. Vendor
should provide all the pre-requisites for the
migration in the proposal.
1. What source systems (ERP, CRM, etc.) does the
current DWH have?
2. What is the current downtime schedule for the
existing DWH?
1. This information is not required at this stage of EOI.
2. This information is not required at this stage of EOI.
221 19
Migration from Existing Setup to
Proposed Solution
Vendor to review the existing architecture during
migration and remove duplication of data and
recommend improvements in overall setup if any
1. What are the deduplication rules? Do they
currently have any defined?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
222 19
Migration from Existing Setup to
Proposed Solution
Vendor should provide a feasible plan for best use
of existing infrastructure which is procured during
last 10 years in staggered manner during the
implementation of Next-Gen DW which will save
cost to the Bank. (Annexure D gives the technology
architecture of the current setup)
1. We need more detailed information about the existing
infrastructure and ecosystem.
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
223 13 Scope of Work Team structure (without actual profiles) Does the bank anticipate OEM involvement in
implementing core OEM-related services via the
Systems Integrator?
Yes, please refer to point number # 16, under Hardware Specification on page
number # 34 in EOI for more details.
224 13 Critical Functional Requirements -
Data Ingestion
Data may be structured, semi-structured, and
unstructured. It may come from internal or external
sources. It may come in batches, incremental
additions or real-time feeds. There should be no
limitation on the type, format and size of data
ingested. Data may include log, feeds, audio, video,
image, NOSQL, RDBMS, unstructured text, through
ERP systems, etc
Please share the ratio of structured : semi-structured :
unstructured (images) : unstructured (videos) data.
This helps in solutioning, taking ground realities
into consideration
Please refer to Annexure G
225 25 Critical Functional Requirements -
Elements based Reporting
Vendor should follow the RBI guideline in
developing the solution with which it will be easier
for the Bank to migrate to the element-based data
reporting envisaged by the RBI.
Please elaborate with an example or two explaining
element-based reports, to have a common
understanding
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
226 42 Annexure E- Functional Use Cases
(Risk Area)
General Would the bank expect Graph Analytics capabilities in
the solution for better network analytics, which helps
determine the betweenness and strength of a network
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
227 25, 39 Critical Functional Requirements -
Regulatory Reporting
Automation –Tool should automate analytics and
reporting workflow end-to-end, including all data
collection, enrichment, and management, as well as
all calculations, processes to final report
submission. Currently 500+ jobs are being used for
Tranche 1 DCT generation along with 500 more for
other regulatory reports/returns.
Please provide the number of returns/reports for
regulatory body over and above the ones listed in
Annexure E (Sno 4)
This information is not required at this stage of EOI.
228 General General General By when is the RFP expected, and by when is the bank
expecting to conclude this?
The reason for this question is: if the contract of the
existing Data Warehouse ecosystem with the existing
vendor is nearing completion, then this will have a
direct bearing on the migration strategy as well as on the
extension of licensing until the migration from the
existing to the new system takes place (which may take a few
months)
This information is not required at this stage of EOI.
229 General General General We understand that SBI has an MDM solution, hence
data quality issues and duplication must be under
control. What DQ tools and quality levels are expected for the
EDW?
Please refer to Data Quality section of EOI on page #23
230 11 Eligibility Criteria Vendor should have existing Next-Gen Data
Warehouse solution as mentioned in the EOI
Request bank to change the clause as Vendor/OEM
should have existing Next-Gen Data Warehouse
solution as mentioned in the EOI
or
keep the same eligibility criteria as per last tender
SBI/GITC/IDSPM/2017/2018/471 Dated: 18/03/2018
i.e - Bidder should have supplied & implemented at
least 3 orders of enterprise class x86 servers and
storage in India within the last 4 years. Minimum cost
of one single order value for x86 servers and storage
supplied in India should be INR 15 Crore in value.
Please refer to Corrigendum.
231 46 Annexure G- Next Gen Data Ware
House Sizing
Next Gen Data Ware House Sizing specified
separately for Data Warehouse and Data Lake
Is the Bank looking for separate solutions for Data Warehouse
and Data Lake? If yes, we assume structured data
only will be in the DWH and semi-
structured/unstructured data will be in the Data Lake.
Request Bank to confirm this.
As clearly mentioned in the EOI requirements, the Bank is looking for an integrated
solution having both Data Warehouse and Data Lake components.
Point #21 of Data Ingestion
Bidder should propose which technology is suitable for each kind of upstream
data ingestion like Data Warehouse, Data Marts, Data Lake, Use Data
Virtualization/Federation layer, etc.
232 16 6 User should be able to work on DB even while
backup is in progress. They should be able to run
statistics and reorganize their tables. Any
background process including backup must not
hamper performance of user queries.
We understand this requirement is for Online Backup
& Database level Archival as well
Yes
233 19 9 Migration of All the remaining components of
existing ecosystem (Mentioned in Annexure - D) as
and when identified by Bank like job scheduler,
reports, history of version control, existing tape
backup, etc.
Kindly confirm if existing data needs to be migrated to
proposed Backup Software. Please confirm existing
backup software details.
Yes, Currently we are using IBM Stack. Bidders are free to propose solution(s)
in the best interest of the Bank to meet the requirements given in the EOI.
234 18 3 Data migration from existing archival solution to
new one.
Kindly confirm details of existing Archival Software Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
235 18 3 Data migration from existing archival solution to
new one.
It is advisable to bring the archival data back to original
production before migration. Kindly add the same
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
236 33 6 Vendor to propose hardware specifications for each
component of Next-Gen DW ecosystem like Data
Warehouse, Data Marts, Data Lake, Data Archival,
Data Federation/Virtualization, Data Science
Platform, Backup, Sandboxes, Functional DR, etc.
for PROD, DEV and UAT environment as applicable
As Backup Software needs to be proposed here, kindly
confirm what features need to be proposed for the
backup software, like deduplication, compression,
encryption, backup data replication, bare metal
recovery, hardware-level application-aware snapshot
backup, etc.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
237 33 6 Vendor to propose hardware specifications for each
component of Next-Gen DW ecosystem like Data
Warehouse, Data Marts, Data Lake, Data Archival,
Data Federation/Virtualization, Data Science
Platform, Backup, Sandboxes, Functional DR, etc.
for PROD, DEV and UAT environment as applicable
Kindly confirm whether the Backup & Archival Solution needs to
be proposed for all applications, i.e. Data Warehouse,
Data Marts, Data Lake, Data Archival, Data
Federation/Virtualization, Data Science Platform
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
238 34 16 Installation and Configuration of Storage and
Backup equipment with Hot, warm and Cold data
segregation
Please confirm if this will be "Disk to Disk to Tape"
backup at DC & DR.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
239 20 1 Data older than specific duration as identified by
Bank to be archived in low cost cold storage.
Changing data archival rules should be easily
configurable. Vendor to propose solution for the
same with cheap and flexible storage and
processing
As we understand, the Archival Solution is required for
File System & Database level Archival
Yes
240 20 3 All the applications connected to the non-archived
data should be available with archived as well
Kindly elaborate the expectations. Access to the archival solution is expected to be similar to the production setup
241 20 5 Store backup of entire ecosystem on suitable cost-
effective, fast recovery infrastructure (Currently
tape backup is taken)
Kindly confirm the number of tapes? We understand the
data on these tapes needs to be integrated into the newly
proposed backup software
Yes, Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
Existing ETL Jobs to be Fine Tuned What kinds of ETL tools are being used currently? If
multiple tools are used, among the current 3000+ jobs,
approximately how many run on each tool?
Duplicate Query. Refer to Sr.No. #7
243 14 13 Trigger mechanisms in identifying any structural
changes at source
Can you help clarify what the structural changes are? Duplicate Query. Refer to Sr.No. #8
244 14 14 The ingestion tools should be able to perform a
change data capture on source systems of this
nature with run time decompression functionality
If the change data capture is in use now, what's the
vendor product? How many jobs are running on
different CDC tools?
Duplicate Query. Refer to Sr.No. #9
245 15 16 scrapping 4000-5000 logs daily having log size of ~
2 TB each scalable up to 10000 logs.
How many source applications are involved
approximately?
Duplicate Query. Refer to Sr.No. #10
246 15 17 Solution should be able to handle DDL change
without manual reorg/runstat.
Could you please provide a use case of a DDL change? Duplicate Query. Refer to Sr.No. #11
247 15 18 A job scheduler, along with process management
controls that provide things like runtime
monitoring and error alerting, handling, and
logging.
Is there any job scheduler used currently? If yes,
what is it? If more than one, please list them as well.
Duplicate Query. Refer to Sr.No. #12
248 16 12 Downstream departments (data consumer) to be
given separate processing power, storage to
undertake their requirements with separate DB
snapshot
Approximately how many downstream departments might be
engaged for DB snapshots? Are they
branch IT or business departments?
Duplicate Query. Refer to Sr.No. #13
249 17 12 ETL/ELT tool for data extraction should be AI/ML
features for suggesting / improving Query / ETL /
ELT Stages
Could you help elaborate on some use cases of the AI/ML
features? Do you know of any ETL tool with AI/ML
features?
Duplicate Query. Refer to Sr.No. #14
250 18 16 Reports on job status and success / failure /
retrigger should be sent to concerned stakeholders
on a continuous basis
Through which channels are reports sent out: email,
SMS, or a real-time monitoring dashboard? Please
help clarify.
Duplicate Query. Refer to Sr.No. #15
251 18 19 Workflow management tool(s) should have
connectors / pluggable interfaces to already
existing / in-use proprietary software available with
the Bank. These could be (and not restricted to)
data repositories, reporting tools, data analysis
tools and generic interfaces for data transfer.
What are the exact vendor products for 'data
repositories, reporting tools, data analysis tools and
generic interfaces for data transfer'?
Duplicate Query. Refer to Sr.No. #16
252 18 3 Data migration from existing archival solution to
new one.
What is the current archival solution? Archived
to a tape library? Please help clarify.
Duplicate Query. Refer to Sr.No. #17
253 19 6 Migration of monitoring dashboard data points. What is the current dashboard product being used now?
Is it an enterprise-level dashboard that will continue to
be used?
Duplicate Query. Refer to Sr.No. #18
254 19 8 Migration of Data Governance, Data Lineage and
Data Quality rules and policies
Is there any automation tool used for governance,
lineage, quality etc? What are they?
Duplicate Query. Refer to Sr.No. #19
255 19 9 history of version control, existing tape backup What does 'version control' mean here? Is it historical
versions of programs/stored procedures, or a version
control tool?
Duplicate Query. Refer to Sr.No. #20
256 19 9 history of version control, existing tape backup Does it mean 'tape backup' migrates to the data lake
or similar? Will the historical tape backups be part of the
migration scope, or will they be left as-is, with only
backups from now onwards taken care of?
Duplicate Query. Refer to Sr.No. #21
257 19 12 Vendor should provide a feasible plan for best use
of existing infrastructure which is procured during
last 10 years
Could you please share the software product stack and
proprietary or open hardware specifics of existing
architecture?
Duplicate Query. Refer to Sr.No. #22
258 19 3 The NEXT-GEN DW – DR solution needs to be set
up at a remote location at Hyderabad.
What is the distance between the PROD site and the DR site?
What is the network bandwidth between them?
Duplicate Query. Refer to Sr.No. #23
259 20 3 The architecture of NEXT-GEN DW should enable
interaction between public cloud and designated
edge servers alone.
Could you please elaborate with some detailed use
case?
Duplicate Query. Refer to Sr.No. #24
260 21 16 Back Dated Data changes needs to be updated on
portal
Could you please clarify with an example of a back-dated
data change? Duplicate Query. Refer to Sr.No. #25
Duplicate Query. Refer to Sr.No. #25
261 22 3 Management of referential integrity Could you please clarify with an example of referential
integrity management?
Duplicate Query. Refer to Sr.No. #26
262 23 7 Recommend Enrichment — Enhancing the value of
internally held data by appending related attributes
from external sources
Could you please give an example? Duplicate Query. Refer to Sr.No. #27
263 23 10 Identity resolution - Identity resolution is the
process of linking various records and is the main
engine for record de-duplication, which can enable
some aspects of data cleansing.
Could you please give an example? Duplicate Query. Refer to Sr.No. #28
264 24 1 Authentication and Identity Management - A
comprehensive identity and access management
system should be available for centralized
management of users and groups.
Are you looking for a standalone AIM specific to this
NEXT-GEN DW solution? Or is there already a
centralized enterprise-level AIM, with which the newly
introduced AIM of this solution would be required to integrate
or sync? Please help
clarify.
Duplicate Query. Refer to Sr.No. #29
265 25 1 Vendor should follow the RBI guideline in
developing the solution with which it will be easier
for the Bank to migrate to the element-based data
reporting envisaged by the RBI.
Does it mean a new regulatory reporting
framework/application is requested to replace the
existing one? Or should the existing regulatory reporting
framework/application be kept, with only the datastore migrated to
NEXT-GEN DW? What is the current framework, and from
which product vendor?
Duplicate Query. Refer to Sr.No. #30
266 28 5 Self-service portal to extract the data on their own
(Should support Data Democratization)
Is there any approval procedure, evaluation of
shareability, or masking/transformation prior to
self-service extraction, and any data destruction procedure
afterwards? Who are the main requestors, and what is
the frequency of requests? Please help clarify.
Duplicate Query. Refer to Sr.No. #31
267 28 1 Implementing end to end analytics use-cases as
mandated by the Bank
Could you list all the analytics use-cases, or some typical
ones?
Duplicate Query. Refer to Sr.No. #32
268 28 2 Power data / objects to existing analytics models
built on proprietary tools (IBM SPSS).
Please advise the exact version of IBM SPSS Duplicate Query. Refer to Sr.No. #33
269 30 1 Capability to connect to various data sources Is there any BI tool being used at this moment? If yes,
what is it and the version?
Duplicate Query. Refer to Sr.No. #34
270 39 1 Database appliance What is the product of the existing appliance? E.g.
Teradata, Exadata?
Duplicate Query. Refer to Sr.No. #35
271 39 3 Scrapping of Jobs from Source What kind of scraping tools are used currently? In
this solution, is there any preference to reuse these
tools?
Duplicate Query. Refer to Sr.No. #36
272 39 3 Scrapping of daily 5TB + logs What kind of logs are these? Are they database logs? If yes,
what are the database products, e.g. Oracle, DB2,
SQL Server, MySQL, etc.?
Duplicate Query. Refer to Sr.No. #37
273 39 4 Reporting Is any reporting tool used, e.g. Tableau? Duplicate Query. Refer to Sr.No. #38
274 39 6 Hardware Monitoring What is the Monitoring product and version? Duplicate Query. Refer to Sr.No. #39
275 39 6 Server memory and compute monitoring of total
80+ production servers
What type are these production servers: appliances
or just commodity x86 servers? What are their
typical specs? How many years have they been
in use?
Duplicate Query. Refer to Sr.No. #40
276 39 7 User database of 30000+ officials How many different types of users are there? How many
are data analysts who can write and run ad hoc SQL
queries?
Duplicate Query. Refer to Sr.No. #41
277 39 8 Job Scheduling What scheduling tool is currently being
used?
Duplicate Query. Refer to Sr.No. #42
278 39 18 10TB Oracle database What version is used? Duplicate Query. Refer to Sr.No. #43
279 48 SCAPM IBM Smart Cloud What kind of applications are running on that? What's
the interaction with NEXT-GEN DW solution?
Duplicate Query. Refer to Sr.No. #44
280 39 Annexure D Existing Data Warehouse Architecture What are the current numbers of users at the
BI/ETL/Database level? What is the user concurrency at
BI/ETL/Database? Also, what is the type/make/core
information of the ETL/DWH/ODS/DM servers?
Refer to Hardware Specifications section starting on page # 32
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
281 39 Annexure D Existing Data Warehouse Architecture Existing backup details are not provided; please
provide backup and other configuration details.
Are existing backups disk-based or tape-based, and what is the frequency
of backups?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
282 19 Disaster Recovery
clause 1
Bank proposes to setup only functional DR to start
with. At later stage Bank may take decision to
setup full scale 100% DR.
Please elaborate on functional DR in terms of PROD
capacity. Is DR being looked at from Day 1?
Please refer to subsection Disaster Recovery on page no #19 of EOI document
283 19 DR Clause 4 The DR solution should be synced with production
NEXT-GEN DW. The SLA for RTO should be
maximum 2Hrs as per Bank’s defined policy.
Since the volumes involved are large, the bandwidth
capacity and the time slots for replication will be
provided by the Bank. Please clarify.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
284 19 DR Clause 6 The proposed solution is expected to have a
monitoring engine that can determine the health of
production NEXT-GEN DW and raise alerts / trigger
remedial actions to bring NEXT-GEN DW – DR as
the default NEXT-GEN DW
Bank wishes to have an automation tool for the same.
Please clarify.
Please refer to subsection Disaster Recovery on page no #19 of EOI document
285 32 HW Specs clause 10 Vendor must provide detailed configuration of the
proposed Hardware, including Hosting Space
Requirements, Racks, Power, Cooling and any other
requirement for the fulfillment of the Vendor’s
obligation in this EOI.
For exact sizing, various inputs/interactions will be
required. For the EOI, an approximate indication should suffice.
Please confirm.
High level/approximate indication would suffice
286 35 HW Specs Clause 30 Vendor is required to provide the minimum
resources to monitor & manage the infrastructure,
however it is the Vendor’s responsibility to right
size the resources to meet the SLA
Clarification is needed on the Bank's expectations regarding the
number of resources. Can you please explicitly
mention the SLA requirements?
This information is not required at this stage of EOI.
287 36 HW Specs Clause 44 Vendor need to propose a solution for data
migration / transfer between Existing DWH (Navi
Mumbai Location 1) and NEXT-GEN DW-PR (Navi
Mumbai Location) and also between NEXT-GEN DW-
PR (Navi Mumbai Location 2) and Hyderabad (DR)
or any other places for PR and DR decided by the
Bank.
Please share details of locations, approximate distances, and the
bandwidth capacity to be provided by the Bank. Please
clarify.
This information is not required at this stage of EOI.
288 36 HW Specs Clause 48 The vendor should provide EXACT size needed for
production in the 1st year and estimated sizes for
consecutive years keeping in view the growth rate
predicted by Bank in this section and provide
empirical evidence for the calculation of growth
rate.
For exact sizing, various inputs/interactions will be
required. For the EOI, an approximate indication should suffice.
Please confirm.
High level/approximate indication would suffice
289 39 Annexure D Existing Data Warehouse Architecture Please share: 1. Detailed architecture diagram of the
existing EDW & 2. Breakup of tiers among the
80+ production servers.
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
290 46 Annexure G Next Gen Data Warehouse Sizing DR to be sized only for DWH. Please confirm. Please refer Annexure G for sizing.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
291 40 Annexure E Functional Use Cases Is the Bank looking to leverage the existing OFSAA
implementation for these use cases? Alternatively, will
OFSAA be fed from the DWH? Please let us know the
interplay between the DWH and OFSAA.
This information is not relevant to EOI.
292 40 Annexure E Functional Use Cases - does the bank see the Next-Gen-DW as a
complement to existing data solutions at the bank
(e.g. OFSAA) ?
This information is not relevant to EOI.
293 40 Annexure E Functional Use Cases - What is the role of Next-Gen-DW in mission-critical operational
processes, e.g. daily RBI and regulator reporting,
real-time financial crime detection, etc.?
This information is not relevant to EOI.
294 12 Annexure B End State Objective: Migration from existing setup Please provide complete details of the existing DWH
solution, including but not limited to:
DWH platform, model, version and size
Details of CPUs, memory, OS, database version
Details of storage configuration: size, capacity, free
and used space, etc.
Whether there is any physical or logical isolation of the
DWH setup
HA, Backup, DR details
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
295 13 Annexure B Performance Benchmark of Next Gen DWH Oracle provides Industry standard globally acceptable
metrices to measure Database performance like IOPS,
throughput and Load rate. We believe this addresses
the requirement
Bidder to provide details of performance benchmarking to enable us to take a
holistic and comprehensive view of the architecture in formulating next
course of action
296 16 4 Storage replication (e.g. RAID) should be
automatically managed by the platform.
We believe this refers to RAID level; if not, please
elaborate further.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
297 16 8 Storage should support data compression. It should
be possible to perform both fast compression and
efficient compression based on data processing
needs.
We recommend the Bank to additionally ask for storage
and a DWH system capable of providing columnar-based
as well as row-based compression. Data should also be
readable without decompression.
No change in standard clause of EOI. Bidders are free to propose solution(s)
in the best interest of the Bank to meet the requirements given in the EOI.
298 16 9 The storage should be horizontally and vertically
scalable. Redistribution of data across the NEXT-
GEN DW should be possible automatically and
seamlessly.
We recommend the Bank to additionally ask that
"storage upgrades and capacity increases should be done
without downtime".
No change in standard clause of EOI. Bidders are free to propose solution(s)
in the best interest of the Bank to meet the requirements given in the EOI.
299 16 11 The storage system should be robust to handle at
least 1,50,000 concurrent queries (Select/DML) by
processing engines / ETL jobs / end users scalable
up to 6,00,000 concurrent queries in next 5 years
(assuming parallelism of 100 degree).
1. Please provide details of the queries and ETL jobs
2. Please provide the ratio of Select/DML queries to ETL
jobs
3. Share the complexity of queries: simple, medium,
complex
4. Also provide the query definitions of simple, medium
and complex queries
This information is not required at this stage of EOI.
300 16 12 Downstream departments (data consumer) to be
given separate processing power, storage to
undertake their requirements with separate DB
snapshot, Audit trails should be available for any
user accessing the Databases. Construction of this
separate Database snapshot and enabling this audit
trails must not cause any major systemic
issues/challenges in smooth functioning of primary
DB.
Please provide the rationale and logic for having separate
processing and storage for this. Powerful DWH
systems today are capable of servicing all consumer
groups in parallel.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
301 17 12 ETL/ELT tool for data extraction should be AI/ML
features for suggesting / improving Query / ETL /
ELT Stages
Please clarify further with an example. This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
302 17 13 Existing reports and extracts generation jobs on
DWH should be analyzed and transformed to the
NEXT-GEN DW. The vendor should use preferably
off-the- shelf tools and not resort to building from
scratch.
Please share sample reports and extracts to propose
most suitable options for migration
This information is not required at this stage of EOI.
303 17 Data transformations should be triggered in
parallel. The NEXT-GEN DW should be capable to
run multiple transformation jobs in parallel. The
NEXT-GEN DW should be able to run at-least 1500
jobs in parallel, scalable up to 5000 in next 5 years,
of varying complexity - simple, medium, complex, in
batch or near real-time
Please define complexity: simple, medium, complex.
Share the split percentage of simple, medium and complex.
Please share sample jobs for each type.
This information is not required at this stage of EOI.
304 27 User Management: Pt4-The access privileges
associated with each system product, e.g.
operating system, network, database, application
and system utilities, and the users to which these
privileges need to be allocated should be clearly
identified and documented.
Should we assume that access privileges are to be
assigned to users directly, and that managing access to
these privileged accounts is not required?
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
305 39 User database of 30000+ officials Should we assume approx 30k users would access the
solution with a YoY increase of 10%?
Refer to Hardware Specifications section starting on page # 32
306 30 6 Reporting on all types of available of Data Formats;
· Structured, semi-structured, unstructured
· Click stream data
· Audit Logs
· Documents
· Multimedia data (Images/Videos/Audios)
· XBRL format
· IRIS iFILE framework
Please clarify the type of database used for each of
the data formats asked for. Or is it safe to assume
the underlying data will be in an industry-standard
relational data format like Oracle, DB2, etc.?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
307 30 5 Visualizations: BI tools must provide below
different types of visualizations;
· Animations, Barcodes
· Bar, line, pie, area and radar chart types
· Tables, Graphs, Infographics, Filters
· Widgets
· Drag and Drop Creation, Customization
· Templates
· Freehand SQL Command
· Geospatial Integration
· Layouts
· Themes
· Ability to mix and match various combinations
Please elaborate on the definitions of:
- Animations
- Infographics: which visualizations are being referred to
- Widgets, Templates
These are industry-standard terms used in data visualization.
308 32 22 In-memory analytics: The product should pull data
into an in-memory or locally cached data store
preferably columnar is an increasingly popular
feature that enables very fast analytics once the
data is loaded.
To achieve fast analytics, BI tools may adopt different architectures. A BI tool can leverage the in-memory benefits of the underlying database without pulling data and creating redundancy, thereby reducing data manageability overhead at the BI layer. Request you to rephrase this point as:
"In-memory analytics: The product should pull data into an in-memory or locally cached data store, or leverage the in-memory capabilities of the underlying database, preferably columnar, an increasingly popular feature that enables very fast analytics once the data is loaded."
No change in standard clause of EOI
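As background to this query, the mechanism the clause describes can be reduced to a small sketch (illustrative only, not tied to any particular BI product; the data values are made up): a columnar in-memory cache keeps each column as its own array, so an aggregate scans only the column it needs once the data is loaded.

```python
# Illustrative sketch of the in-memory columnar cache described in the
# clause (not any specific BI product; data values are made up).

# Row-oriented data as it might arrive from the underlying database.
rows = [
    {"branch": "MUM01", "product": "SAV", "balance": 1200.0},
    {"branch": "MUM01", "product": "CUR", "balance": 5300.0},
    {"branch": "HYD02", "product": "SAV", "balance": 800.0},
]

# "Pull" the data into a columnar store: one contiguous list per column.
columnar = {col: [row[col] for row in rows] for col in rows[0]}

# An aggregate now touches a single column and ignores the rest, which is
# what makes repeated analytics fast once the data is loaded.
total_balance = sum(columnar["balance"])  # 7300.0
```

The bidder's alternative, leveraging the in-memory features of the underlying database, avoids this second copy of the data, which is exactly the trade-off the query raises.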
309 32 23 Offline updates: BI tools, when storing copies of
the source data in an online analytical processing
(OLAP) cube or in-memory columnar data store,
should enable business users to schedule
automatic data updates.
Different BI tools have different architectures. A BI tool can easily leverage the capabilities of the underlying database. Is this point relevant to BI tools whose architecture stores the data with the BI server?
No change in standard clause of EOI
310 32 28 Speed of access: Query performance will vary based
on the complexity of the queries and the amount of
data involved. Dashboards with multiple
visualizations will need to get query results from
many queries. The best practice is to create several
prebuilt query scenarios and compare how each
product performs based on these specific
examples. The worse practice is to just arbitrarily
rate the speed.
These seem to be best practices for implementation. Please elaborate on what is required from the BI tool.
Requirement from BI Tool is a faster speed of access. Bidders are free to
propose solution(s) in the best interest of the Bank to meet this requirement.
311 32 29 The best practice is to establish a testing
environment to determine scalability in terms of
both the number of concurrent users and data
metrics, such as volumes, variety and veracity.
These seem to be best practices for implementation. Please elaborate on what is required from the BI tool.
Requirement is to set up a testing environment by following best practice.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet this requirement.
312 32 32 Ability to handle and summarize huge volumes of
data. E.g. 30-40 million rows accessed on index and
summarized over 5 to 8 metrics.
Please elaborate on the use case for consuming 30-40 million rows from BI. Usually a BI tool leverages the underlying database to summarize the data and works only on the resulting dataset.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
313 35 36 The web portal of Business Intelligence tool should
support at-least 25000
concurrent users, scalable up to 75000 in next 5
years, accessing various reports generated
For sizing the Business Intelligence layer, we need a bifurcation of the 25000 concurrent users:
Total concurrent users: 25000
Number of concurrent active users: please provide the count
Number of logged-in/inactive users: please provide the count
Of the active users:
- Users executing BIEE dashboards (having 4-5 reports or simple charts in a dashboard)
- Users executing large pivot-table operations (25000+ rows)
- Users executing export-to-PDF/Excel operations (small to medium sized reports, 50K cells or lower)
- Users executing very heavy graphics
Number of active concurrent extra-large reports (usually extra-large reports are executed in off-line hours)
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
314 15 point # 16 Proposed solution should be able to scrap
encrypted log, capture Metadata
changes at source level completely, scrapping 4000-
5000 logs daily having log
size of ~ 2 TB each scalable up to 10000 logs.
Proposed solution should be
capable of scrapping logs generated by any type of
Database. E.g. Oracle
Database, IBM DB2 Database etc.
Kindly specify whether decryption of logs is also required, or whether storage of such logs alone is sufficient. If decryption is required, is the decryption logic available in the specified system?
Yes. Bidder to support the encryption/decryption logic at each source system
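The figures quoted in the clause itself imply a very large daily volume, which is worth making explicit when weighing the decryption question. The following is back-of-envelope arithmetic only, using the clause's stated numbers and no additional information from the Bank:

```python
# Back-of-envelope daily ingest implied by the clause's own figures.
# All numbers come from the quoted clause; nothing here is additional
# information from the Bank.

log_size_tb = 2                     # "~2 TB each"
logs_low, logs_high = 4000, 5000    # "4000-5000 logs daily"
logs_scaled = 10000                 # "scalable up to 10000 logs"

daily_tb_low = logs_low * log_size_tb        # 8000 TB/day  (~8 PB)
daily_tb_high = logs_high * log_size_tb      # 10000 TB/day (~10 PB)
daily_tb_scaled = logs_scaled * log_size_tb  # 20000 TB/day (~20 PB)
```

At petabyte-per-day scale, decrypt-and-scrape throughput, rather than raw storage, is likely the binding constraint, which is why the bidder's question about where decryption happens matters.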
315 38 Annexure C -Monthly Data processed
in DWH Warehouse
Archived log extract CBS (SBI) +
TF (SBI)
Are these logs encrypted?
Do we need to keep the RAW logs into the system? Or
only processed logs ?
Yes, logs are encrypted. Bidders are free to propose solution(s) in the best interest of the Bank to meet the requirements given in the EOI.
316 29 # 9 (Data Science Platform with
AI/ML Capabilities)
GPUs to be incorporated in solution if possible
using HDFS Hadoop like
environment for better analytical results
1. Is there a requirement to run AI/ML models within HDFS/Hadoop, or is the expectation to pull the data into a GPU-based analytics workbench and then process it?
2. Running AI/ML models within Hadoop is also faster, and having a separate GPU-based system for specific AI models can reduce the cost of the GPU-based solution. Please suggest.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
317 15 # 1 - Data Storage A multi-temperature data management solution to be proposed by vendor, where data that is frequently accessed is kept on fast storage (hot data), less-frequently accessed data is stored on slightly slower storage (warm data), and rarely accessed data is stored on the slowest storage (cold data). System should also be capable of automated storage tiering and seamless data transfer between hot, warm and cold storage. Data residing in any of these storage areas must be seamlessly mixed / merged according to requirements without impacting performance.
Kindly share the tentative timelines for hot/warm/cold data so that we can calculate the size. Example: hot data - 6 months, warm data - 1 year, cold data - more than 1 year, etc.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
318 18 18 Transformations for this activity can be categorized
into the following types:
· Existing transformations in DWH that needs to be
migrated to NEXT-GEN DW
Please share the existing transformation details. This information is not required at this stage of EOI.
319 20 1 Data older than specific duration as identified by
Bank to be archived in low cost cold storage.
Changing data archival rules should be easily
configurable. Vendor to propose solution for the
same with cheap and flexible storage and
processing
Please provide the retention period. This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
320 20 5 Store backup of entire ecosystem on suitable cost-
effective, fast recovery infrastructure (Currently
tape backup is taken)
Please provide the details of existing Tape Backup
Solution, Backup Window, Backup Throughput and
Restoration throughput
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
321 13 Annexure B - Technical Criteria/Scope
of Work > Critical Functional
Requirements > Data Ingestion Sr #2
Data may be structured, semi-structured, and
unstructured. It may come from internal or external
sources. It may come in batches, incremental
additions or real-time feeds. There should be no
limitation on the type, format and size of data
ingested. Data may include log, feeds, audio, video,
image, NOSQL, RDBMS, unstructured text, through
ERP systems, etc
Which RDBMS source data is required to be extracted in real-time mode? Please provide the source system name, RDBMS type (Oracle/SQL Server etc.) and the underlying OS.
This information is not required at this stage of EOI.
322 19 Annexure B - Technical Criteria/Scope
of Work > Critical Functional
Requirements >Migration from
Existing Setup to Proposed Solution
Sr #5
Migration of existing data extraction and reporting
jobs.
Since this is across different products, is this expected
to be semi-automated / manual?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
323 19 Annexure B - Technical Criteria/Scope
of Work > Critical Functional
Requirements >Migration from
Existing Setup to Proposed Solution
Sr #6
Migration of monitoring dashboard data points. Since this is across different products, is this expected
to be semi-automated / manual?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
324 19 Annexure B - Technical Criteria/Scope
of Work > Critical Functional
Requirements >Migration from
Existing Setup to Proposed Solution
Sr #7
Migration of user details. Since this is across different products, is this expected
to be semi-automated / manual?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
325 19 Annexure B - Technical Criteria/Scope
of Work > Critical Functional
Requirements >Migration from
Existing Setup to Proposed Solution
Sr #8
Migration of Data Governance, Data Lineage and
Data Quality rules and policies
Since this is across different products, is this expected
to be semi-automated / manual?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
326 38 Annexure C - Monthly Data
processed in DWH Warehouse Sr #4
Contribution from other source systems like DMAT,
CMP, SBI Life, LOS, etc.
Will these also be flat files? If not, what will be the interface mode (RDBMS, web services, API, etc.) and which RDBMS?
Data ingestion support for all possible interface types is to be considered in the solution.
327 38 Annexure C - Monthly Data
processed in DWH Warehouse Sr #4
Contribution from other source systems like DMAT,
CMP, SBI Life, LOS, etc.
Please provide a count of data sources (best approximation), and of these, how many will be flat-file sources?
This information is not required at this stage of EOI.
328 23 Annexure B - Technical Criteria/Scope
of Work > Critical Functional
Requirements > Data Quality Sr #12
Mechanism to capture feedback from end users to
report Data Quality issues
Please elaborate. Can this be implemented using enterprise collaboration tooling or a ticket maintenance system?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
329 39 Annexure D - Existing Data
Warehouse Architecture Sr #14
Data Quality What are the existing Data Quality details? How many
and which entity masters are maintained? What is the
current count of each type of Entity and how are their
counts expected to scale up (volumetrics)?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
330 46 Annexure G Annexure G - Next Gen Data Warehouse Sizing What is the rationale behind the DWH size (592.16TB - 2251.96TB) at Primary and (19.59TB - 75.26TB) at DR Site?
We have analyzed the data points which should be there in production and DR. The rationale is based on the same.
However, Bidders are expected to propose storage forecast over and above
given sizing in the solution to ensure fast performance of system.
331 11 Annexure A Vendor should have existing Next-Gen Data
Warehouse solution as mentioned in the EOI
Bidder wants to understand whether this criterion is applicable to the bidder or to the platform (OEM) proposed as part of the solution to the EOI.
Please refer to Corrigendum.
332 11 Annexure A The company/firm should be profit making
organization for last 3 years.
Bidder requests this clause to be modified as below:
Bidder should be a profitable organization based on Profit After Tax (PAT) for at least two out of the last three financial years (2015-2016, 2016-2017, 2017-2018).
No change in standard clause of EOI
333 12 Technical Criteria/Scope of Work :
End state objectives
Log Storage/Archive Solution Bidder wants to understand whether an archival solution is required, considering the Bank wants to implement both a Primary DC and a Secondary DR site.
Yes, Archival solution is required as mentioned in 'Data Archival and Backup'
sub section on page # 20 of EOI
334 17 Data Processing Framework #14 at least 1500 jobs in parallel, scalable up to 5000 Bidder requests details of job types (ETL, ELT, system processes/queries and end-user queries) along with load type, whether simple, medium or complex.
Also, the bidder would like to understand whether the jobs would run in batch or in near real time.
Kindly specify the percentage of the total number of jobs under each category (namely simple, medium, complex).
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
335 18 Data Federation/Virtualization Vendor should propose a detailed seamless
automated migration plan from existing setup to
proposed solution. Plan should focus on less
manual intervention, data reconciliation between
the systems and minimum parallel run of existing
and proposed solution
Bidder requests details of the existing setup (product details and HW).
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
336 18 Migration from Existing Setup to
Proposed Solution
Vendor should propose a detailed seamless
automated migration plan from existing setup to
proposed solution. Plan should focus on less
manual intervention, data reconciliation between
the systems and minimum parallel run of existing
and proposed solution
Bidder requests details of the existing setup (product details and HW):
1. DW software and HW configuration of nodes
2. ETL software and HW configuration of nodes
3. Data Federation/Virtualization software and HW configuration of nodes
4. BI software and HW configuration of nodes
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
337 19 Disaster Recovery Bank proposes to setup only functional DR to start
with. At later stage Bank may take decision to
setup full scale 100% DR.
Bidder requests details on the location of the DR. Bidder also requests additional details on the functional DR requirement, e.g. 10% of Storage, 50% of ETL, 50% of BI.
Storage requirements for functional DR are specified in Annexure G. Bidders
are free to propose solution(s) in the best interest of the Bank to meet the
requirements given in the EOI.
338 20 Data Archival and Backup Data older than specific duration as identified by
Bank to be archived in low cost cold storage.
Changing data archival rules should be easily
configurable. Vendor to propose solution for the
same with cheap and flexible storage and
processing
Bidder requests the Bank to specify which of the following kinds of backup archival is required:
a) Disk to disk, or
b) Disk to tape.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
339 22 Data Governance #14 Parallel processing: Data governance tool should be
able to handle 500 concurrent users, scalable up to
1000 users in next 5 years, running any kind of job
(eg: Data Lineage on simple/medium/complex jobs
running on multiple tables)
Bidder requests the Bank to clarify whether these jobs are a subset of Data Processing Framework #14.
No, both are completely separate jobs.
340 16 Data Storage #6 User should be able to work on DB even while
backup is in progress. They should be able to run
statistics and reorganize their tables. Any
background process including backup must not
hamper performance of user queries.
Bidder understands this requirement covers online backup and database-level archival as well. Kindly confirm.
Duplicate Query. Refer to Sr.No. #232
341 19 Migration from Existing Setup to
Proposed Solution #9
Migration of All the remaining components of
existing ecosystem (Mentioned in Annexure - D) as
and when identified by Bank like job scheduler,
reports, history of version control, existing tape
backup, etc.
Kindly confirm whether existing data needs to be migrated to the proposed backup software. Please provide details of the existing backup software.
Duplicate Query. Refer to Sr.No. #233
342 18 Migration from Existing Setup to
Proposed Solution #3
Data migration from existing archival solution to
new one.
Kindly confirm details of existing Archival Software Duplicate Query. Refer to Sr.No. #234
343 18 Migration from Existing Setup to
Proposed Solution #3
Data migration from existing archival solution to
new one.
It is advisable to bring the archival data back to original production before migration. Kindly add the same.
Duplicate Query. Refer to Sr.No. #235
344 33 Hardware Specifications #6 Vendor to propose hardware specifications for each
component of Next-Gen DW ecosystem like Data
Warehouse, Data Marts, Data Lake, Data Archival,
Data Federation/Virtualization, Data Science
Platform, Backup, Sandboxes, Functional DR, etc.
for PROD, DEV and UAT environment as applicable
As backup software needs to be proposed here, kindly confirm which features need to be proposed for the backup software, such as deduplication, compression, encryption, backup data replication, bare metal recovery, hardware-level application-aware snapshot backup, etc.
Duplicate Query. Refer to Sr.No. #236
345 33 Hardware Specifications #6 Vendor to propose hardware specifications for each
component of Next-Gen DW ecosystem like Data
Warehouse, Data Marts, Data Lake, Data Archival,
Data Federation/Virtualization, Data Science
Platform, Backup, Sandboxes, Functional DR, etc.
for PROD, DEV and UAT environment as applicable
Bidder requests confirmation whether a Backup & Archival solution needs to be proposed for all applications, i.e. Data Warehouse, Data Marts, Data Lake, Data Archival, Data Federation/Virtualization, Data Science Platform.
Duplicate Query. Refer to Sr.No. #237
346 34 Hardware Specifications # 16 Installation and Configuration of Storage and
Backup equipment with Hot, warm and Cold data
segregation
Please confirm if this will be "Disk to Disk to Tape"
backup at DC & DR .
Duplicate Query. Refer to Sr.No. #238
347 20 Data Archival and Backup #1 Data older than specific duration as identified by
Bank to be archived in low cost cold storage.
Changing data archival rules should be easily
configurable. Vendor to propose solution for the
same with cheap and flexible storage and
processing
Bidder understands that the Archival Solution is
required for File System & Database level Archival .
Please confirm.
Yes
348 20 Data Archival and Backup #3 All the applications connected to the non-archived
data should be available with archived as well
Request the Bank to elaborate further on the scope of work expected from the Bidder under this criterion.
Access to archival solution is expected to be similar to production setup
349 20 Data Archival and Backup #5 Store backup of entire ecosystem on suitable cost-
effective, fast recovery infrastructure (Currently
tape backup is taken)
Kindly confirm the number of tapes. Bidder understands that the data on these tapes needs to be integrated into the newly proposed backup software. Kindly confirm the scope.
Yes, Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
350 20 Cloud Integration and Migration -
Point 5
In view of the intent to reduce the hardware
footprint (in future), the technical architecture of
NEXT-GEN DW solution should be flexible to
accommodate provisioning of NEXT-GEN DW on
cloud.
Is the Bank expecting the NEXT-GEN DW solution to be flexible enough to accommodate provisioning of NEXT-GEN DW on public cloud? Please confirm.
At present, as per the Bank's IS policy, migrating/storing data in public cloud is
not permitted. However bidders may propose, as an alternative, use of cloud
(public cloud, private cloud, on-premise etc) in addition to the best integrated
proposed solution.
351 36 Hardware Specifications - Point 45 Database should be linearly scalable which can
expand the database capacity by just adding more
nodes to the existing database.
Is the Bank expecting supply of Hyper-Converged Infrastructure?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
352 37 Hardware Specifications - Point 51 Vendor to submit all back-to-back agreement
copies between Vendor and SI / OEM / Parent
company etc if any and tenure of the back-to-back
agreement should be same as selected Vendor’s
agreement with the Bank
Please elaborate on the type of back-to-back agreement required with OEMs.
Please refer to Corrigendum.
353 Page 20 Data Archival and Backup Store backup of entire ecosystem on suitable cost-
effective, fast recovery infrastructure (Currently
tape backup is taken)
Please provide the details of the backup window, e.g. 1 TB of data has to be backed up to tape in 1 hour.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
354 Page 20 Data Archival and Backup Data older than specific duration as identified by
Bank to be archived in low cost cold storage.
Changing data archival rules should be easily
configurable. Vendor to propose solution for the
same with cheap and flexible storage and
processing
Please provide the details of the existing archival storage and retention policy.
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
355 Page 33 Point 5 The proposed architecture considers vertical and
horizontal scalability as one of the most important
design principles.
Request to please reconsider vertical scalability, because after 2-3 years the availability of a particular server processor declines and its commercial impact becomes very high.
No change in standard clause of EOI
356 Page 33 Point 6 Vendor to propose hardware specifications for each
component of Next-Gen DW ecosystem like Data
Warehouse, Data Marts, Data Lake, Data Archival,
Data Federation/Virtualization, Data Science
Platform, Backup, Sandboxes, Functional DR, etc.
for PROD, DEV and UAT environment as applicable
Is the Bank expecting infra to be provided for the Sandbox environment? Please confirm.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
357 Page 33 Point 10 Vendor must provide detailed configuration of the
proposed Hardware, including Hosting Space
Requirements, Racks, Power, Cooling and any other
requirement for the fulfillment of the Vendor’s
obligation in this EOI
What will be the per-rack power availability (in kVA)? Please confirm.
Bidder to propose required power per rack
358 Page 33 Point 11 Vendor will supply hardware resources and related
services at the desired locations (Production and
DR)
Is the Bank expecting the same manpower resources at the DR site as at the DC to maintain the infra? Please confirm.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
359 Page 34 Point 16 The Vendor shall also carry out OS Hardening, Anti-
Virus installation, Create Super user for the
Production, DR and UAT/Dev environment
according to Bank’s policy and secured
configuration document
Is the Bank expecting the bidder to supply an antivirus solution (licenses) and antivirus servers, or will the Bank provide the AV licenses? Please confirm.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
360 Page 35 Point 23 System Administration Support- Service Provider
must provide 24X7 supports for the Administration,
Maintenance, Up-gradation (related to hardware)
and other related activity to keep system running
so that high availability can be assured
Is the Bank expecting 24x7 onsite support from the bidder for the proposed infrastructure, or can the bidder provide remote support? Please confirm.
Proposal should include 24x7 onsite support
361 Page - 36 Point 44 Vendor need to propose a solution for data
migration / transfer between Existing DWH (Navi
Mumbai Location 1) and NEXT-GEN DW-PR (Navi
Mumbai Location 2) and also between NEXT-GEN
DW-PR (Navi Mumbai Location 2) and Hyderabad
(DR) or any other places for PR and DR decided by
the Bank.
We understand that there will be no requirement for a Near DR. Please confirm.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
362 Page - 36 Point 45 Database should be linearly scalable which can
expand the database capacity by just adding more
nodes to the existing database. If the data volume
grows more hardware can be added and expand
the database capacity
Is the Bank expecting the bidder to supply the DB license, or will the Bank provide the DB license under the Bank's EULA with the OEM? Please confirm.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
363 Page - 36 Point 48 The vendor should provide EXACT size needed for
production in the 1st year and estimated sizes for
consecutive years keeping in view the growth rate
predicted by Bank in this section and provide
empirical evidence for the calculation of growth
rate.
Is the Bank expecting hardware to be provided only for the 1st year, or hardware sized for five years? Please confirm.
Solution should include hardware sizing for complete duration of the project
364 Annex-D Page 39 270+ TB Data with compression index of 2.5 Is the Bank using any archival software with an archival appliance to archive the data? Please confirm.
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
365 Annex-G Page 46 Prod in TB Is the Bank expecting storage capacity to be provided for the 1st year only, or does the bidder have to provide storage through the 5th year? Please confirm.
Bidder to propose the storage requirements from Year 1 to Year 5.
However, Bidders are expected to propose storage forecast over and above
given sizing in the solution to ensure fast performance of system.
366 24 Critical Functional Requirements Authentication and Identity Management - A
comprehensive identity and access management
system should be available for centralized
management of users and groups. It should be
possible to quickly create and revoke the identity of
a user or a service by simply deleting or disabling
the account in the directory. Multi-factor
authentication is desired as an additional layer of
security for user sign-in and transactions.
Do we need to provide a two-factor authentication solution, or do we just need to integrate with SBI's existing two-factor solution? Please clarify.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
367 Security Do we need to provide the entire security solution, or can we leverage existing SBI security devices?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
368 34 point 16 The Vendor shall ensure all Installations &
Implementation to be done by OEM badged
racks for hosting including all required cabling & all
other activities required for installation of
hardware
Please clarify whether the bidder is to provide only the immediately connecting network switches, while SBI provides the rest of the needed infrastructure.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
369 34 point 16 Installation and Configuration of Network
equipment
Please clarify whether the bidder is to provide even the networking racks and passive cabling.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
370 35 The Hardware solution must be compatible to
integrate with various systems in the Bank
including but not limited to SOC, PIMS, NOC,
Command Centre, ITAM, Service Desk, ADS, and
SSO etc. at no extra cost. Vendor will have to give
appropriate support to the Bank during integration
with various components of IT environment.
Can the bidder leverage the existing PIM / DLP / WAF / Firewall / IPS / IDS / LB / DAM solutions deployed at SBI, extended for this engagement?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
371 13 Annexure B - Tech Rackspace, Power, Cooling, Network connectivity & Bidder to provide only the replication and Internet bandwidth sizing, while SBI will procure the same along with the necessary routers.
This information is not required at this stage of EOI.
372 34 All work related to patch panels will be done by
Vendor.
Bidder to provide only the number of networking ports required, while SBI will procure the necessary network switches.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
373 34 point 16 Installation and Configuration of Security
equipment
Do we need to provide an encryption/decryption solution other than Hadoop-native encryption?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
374 34 point 16 Installation and Configuration of Security
equipment
Do we need to provide HSM solutions if data at rest or data in motion is in a non-Hadoop system?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
375 34 point 16 Installation and Configuration of Security
equipment
Do we need to provide the PKI based / Token based
Authentications?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
376 34 point 16 Installation and Configuration of Security
equipment
Do we need to propose DAM (Database Activity Monitoring)?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
377 The hardware will be delivered in a staggered
manner and Vendor to provide a plan for the same
Please clarify this statement, as OEMs do not provide commercials valid for longer durations.
Commercials are not required at this stage of EOI
378 Which infra monitoring tools can we leverage? Please clarify.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
379 33 The proposed hardware must not fall into ‘End of
Support’ for at least 7 years from the date of
delivery to the Bank.
How many Years Total Contract will be? This information is not required at this stage of EOI.
380 33 point number 12 The Vendor is required to supply, install, test,
commission, monitor, manage and maintain the IT
System along with operating system and other
peripherals with one-year warranty and AMC for 4
years from the date of delivery at data centers
advised by the Bank
How many years of post-implementation support are
needed? Please clarify.
This information is not required at this stage of EOI.
381 The proposed hardware is mission critical for the
proposed project and support of 24 X 7 with an
uptime of 99.99 % to be ensured by providing
support at PR, and DR site for a period of 5 years.
Does SBI require a total of 5 years of support after go-live?
Please clarify.
This information is not required at this stage of EOI.
382 13 Annexure B Technical Criteria/Scope
of Work
Detailed Migration Plan including timelines from
existing to new setup
The detailed Migration plan and timelines can only be
provided after completely understanding the existing
DWH solution.
Currently we are using IBM Stack. Bidders are expected to propose high level
migration plan
383 14 Data Ingestion, Point 8 Existing ETL Jobs to be Fine Tuned. What is the current ETL tool being used by the bank?
Does the Bank expect the vendor to propose the same
tool?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
384 14 Data Ingestion, Point 8 Existing ETL Jobs to be Fine Tuned. Request the bank to elaborate more on this
statement. Does the Bank expect the existing ETL
jobs to keep running in the current DWH even after the
new DWH is functional?
Migrated ETL jobs will run on the Next-Gen DW and are expected to be fine-tuned
on the new setup.
385 14 Data Ingestion, Point 15 Vendor should list out all types of risks they expect
from the ingestion subsystem (e.g., dropping of
data packets during ingestion, security loopholes,
unprotected personally identifiable information,
etc.) along with mechanisms and processes they
would implement for mitigating such risks.
The detailed list of risks can be suggested only when
we have a detailed understanding of the bank's current
systems.
Bidder to propose how best they can fulfill the requirement considering their
experience with Banking data.
386 15 Data Ingestion, Point 16 Proposed solution should be able to scrape
encrypted logs, capture Metadata changes at source
level completely, scraping 4000-5000 logs daily
having log size of ~ 2 TB each, scalable up to 10000
logs. Proposed solution should be capable of
scraping logs generated by any type of Database,
e.g. Oracle Database, IBM DB2 Database etc.
What is the mechanism of scraping the logs in the
current DWH? Request bank to share the tool details.
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
387 15 Data Ingestion, Point 17 Solution should be able to handle DDL change
without manual reorg/runstat. It should handle
network fluctuations and hindrances.
We understand that if there is any structural change in
an object at the source which is being ingested into the
new system, then that change, if required, needs to be
propagated to the new system and will need manual
intervention. Please confirm our understanding.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
388 15 Data Ingestion, Point 21 Vendor should propose which technology is
suitable for each kind of upstream data ingestion
like Data Warehouse, Data Marts, Data Lake, Use
Data Virtualization/Federation layer, etc.
Request bank to elaborate on this requirement. Solution proposed by the Bidder should be capable of ingesting any kind of
data into appropriate components like Data Warehouse, Data Lake, Data
Marts or use of Data Virtualization depending on the requirement
389 16 Data Storage, Point 11 The storage system should be robust to handle at
least 1,50,000 concurrent queries (Select/DML) by
processing engines / ETL jobs / end users scalable
up to 6,00,000 concurrent queries in next 5 years
(assuming parallelism of 100 degree).
Request the bank to elaborate on the kind of queries
the end users will be making on the storage layer. Also
please elaborate on the different types of users that
will be accessing the storage layer.
This information is not required at this stage of EOI.
390 16 Data Storage, Point 12 Downstream departments (data consumer) to be
given separate processing power, storage to
undertake their requirements with separate DB
snapshot, Audit trails should be available for any
user accessing the Databases.
Please provide a list of all downstream applications
which will be consuming data from the data storage
layers.
This information is not required at this stage of EOI.
391 17 Data Processing Framework, Point 2 The NEXT-GEN DW ecosystem should have state of
the art data processing engines that can perform in-
memory processing to reduce the time for data
transformations and query in case of real time
requirements.
What are the source systems from which data will be
ingested in real time?
This information is not required at this stage of EOI.
392 17 Data Processing Framework, Point 2 The NEXT-GEN DW ecosystem should have state of
the art data processing engines that can perform in-
memory processing to reduce the time for data
transformations and query in case of real time
requirements.
What are the use cases for real time reporting? This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
393 17
Data Processing Framework, Point 6
Have a workflow management and scheduling
solution to schedule data transformation, data
acquisition or data delivery jobs.
Allocations of separate workload channel to
designated queries
Request bank to elaborate on this requirement. This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
394 17
Data Processing Framework, Point 12
ETL/ELT tool for data extraction should have AI/ML
features for suggesting / improving Query / ETL /
ELT Stages
Request bank to elaborate more on this requirement This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
395 18 Data Processing Framework, Point 18 Transformations for this activity can be categorized
into the following types:
migrated to NEXT-GEN DW
not sourced by DWH
real time data capture
Request Bank to provide the details of the existing
transformations that needs to be migrated into the
NEXT-GEN DW.
This information is not required at this stage of EOI.
396 18 Migration from Existing Setup to
Proposed Solution, Point 1
Vendor should propose a detailed seamless
automated migration plan from existing setup to
proposed solution. Plan should focus on less
manual intervention, data reconciliation between
the systems and minimum parallel run of existing
and proposed solution.
The detailed Migration plan and timelines can only be
provided after completely understanding the existing
DWH solution.
Currently we are using IBM Stack. Bidders are expected to propose high level
migration plan.
397 19 Migration from Existing Setup to
Proposed Solution, Point 10
Vendor should list out all types of risks they expect
during the migration. Vendor should provide
justification if any downtime is required on existing
or proposed system during migration. Vendor
should provide all the pre-requisites for the
migration in the proposal.
The list of risks expected during data migration can be
listed only after understanding the existing DWH
solution.
Currently we are using IBM Stack. Bidders are expected to propose high level
migration plan.
398 19 Migration from Existing Setup to
Proposed Solution, Point 10
Vendor should list out all types of risks they expect
during the migration. Vendor should provide
justification if any downtime is required on existing
or proposed system during migration. Vendor
should provide all the pre-requisites for the
migration in the proposal.
The requirement of a downtime can be proposed after
understanding the complete ecosystem of the current
DWH solution.
Currently we are using IBM Stack. Bidders are expected to propose high level
migration plan.
399 19 Migration from Existing Setup to
Proposed Solution, Point 11
Vendor to review the existing architecture during
migration and remove duplication of data and
recommend improvements in overall setup if any
Does the bank intend to continue using the current
DWH solution even after the new solution is live?
Please elaborate.
Bank will take a final call at appropriate time.
400 19 Migration from Existing Setup to
Proposed Solution, Point 12
Vendor should provide a feasible plan for best use
of existing infrastructure which is procured during
last 10 years in staggered manner during the
implementation of Next-Gen DW which will save
cost to the Bank. (Annexure D gives the technology
architecture of the current setup)
Request bank to share the complete technical stack of
the current DWH in order for us to suggest the best
feasible plan for the existing infrastructure.
Currently we are using IBM Stack. Bidders are expected to propose high level
migration plan.
401 20 Cloud Integration and Migration,
Point 1
NEXT-GEN DW should be able to consume data
from external cloud-based infrastructures.
Request bank to share the complete information
about the external cloud based source systems from
which data needs to be consumed in the new system
This information is not required at this stage of EOI.
402 20 Cloud Integration and Migration,
Point 5
In view of the intent to reduce the hardware
footprint (in future), the technical architecture of
NEXT-GEN DW solution should be flexible to
accommodate provisioning of NEXT-GEN DW on
cloud. The Bank understands that there can be
differences in services offered by cloud service
providers. The NEXT-GEN DW solution architecture
should be designed considering as-is infrastructure
availability in cloud.
We understand that the proposed solution should be
compatible as IAAS on the cloud platform. Please
confirm our understanding.
Yes. At present, as per the Bank's IS policy, migrating/storing data in a public
cloud is not permitted. However, bidders may propose, as an alternative, the use
of cloud (public cloud, private cloud, on-premise, etc.) in addition to the best
integrated proposed solution.
403 20 Cloud Integration and Migration,
Point 6
Adherence to global standards related to cloud Request bank to share the global standards mentioned
in this requirement.
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
404 21 Monitoring Dashboard, Point 16 Back Dated Data changes needs to be updated on
portal
Which portal is being referred to in this statement? Portal refers to the Monitoring Dashboard
405 22 Data Governance, Point 12 Metadata Management Capability: Tool should
cater to three broad categories of metadata;
Business metadata, Technical metadata and
Operational metadata
Does the current DWH solution have a Metadata
Management solution? If yes, please share the tool
used.
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
406 22 Data Governance, Point 12 Metadata Management Capability: Tool should
cater to three broad categories of metadata;
Business metadata, Technical metadata and
Operational metadata
If there is an existing Metadata Management solution,
do we need to migrate the existing metadata to the
new solution?
Yes
407 22 Data Governance, Point 13 Masterdata Management Capability: Master Data
Management tool (s) should deliver consolidated,
complete and accurate view of business-critical
master information to all the operational and
analytical systems across the Bank.
Does the current DWH solution have an MDM solution?
If yes, please share the tool used.
No, the DWH does not have an MDM solution
408 22 Data Governance, Point 13 Masterdata Management Capability: Master Data
Management tool (s) should deliver consolidated,
complete and accurate view of business-critical
master information to all the operational and
analytical systems across the Bank.
Request bank to elaborate more on the MDM solution
that is required.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
409 23 Data Quality, Point 12 Mechanism to capture feedback from end users to
report Data Quality issues
Request bank to elaborate more on this requirement Refer to Sr.No. #318
410 27 Data Encryption, Point 5 The overall SLA for data processing should be
adhered to, keeping data encryption as an
important activity.
What is the SLA for a data processing job? Please
provide details of all the SLAs that will be applicable
to the new system.
This information is not required at this stage of EOI.
411 27 Data Encryption, Point 6 Proposed Solution should be capable of ingesting
encrypted data from source system. It should
support the encryption/decryption mechanism
implemented at source system.
Request bank to share the encryption/decryption
mechanism implemented at source system.
This information is not required at this stage of EOI.
412 28 Downstream Data Consumption Self-service portal to extract the data on their own
(Should support Data Democratization)
Request bank to elaborate more on this requirement.
Does the vendor need to propose a portal solution for
this requirement?
Yes, Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
413 30 Business Intelligence Tools, Point 5 Visualizations: BI tools must provide below
different types of visualizations;
Please provide the use case for visualizations having
animations.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
414 30 Business Intelligence Tools, Point 6 Reporting on all types of available Data Formats; Please provide the use case for reporting on
Multimedia data.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
415 32 Business Intelligence Tools, Point 26 Ease of analytical use: There should be different
criteria defined for each type of user, such as
information consumer, business analyst and IT.
What are the different types of users that will be
accessing the Business Intelligence Solution?
Refer to Hardware Specifications section starting on page # 32
416 32 Business Intelligence Tools, Point 26 Ease of analytical use: There should be different
criteria defined for each type of user, such as
information consumer, business analyst and IT.
Please provide count of such BI Power Users and BI
Recipient users.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
417 32 Business Intelligence Tools, Point 28 The best practice is to create several prebuilt query
scenarios and compare how each product performs
based on these specific examples. The worst
practice is to just arbitrarily rate the speed.
Request bank to elaborate more on this requirement
with an example.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
418 32 Business Intelligence Tools, Point 28 The best practice is to create several prebuilt query
scenarios and compare how each product performs
based on these specific examples. The worst
practice is to just arbitrarily rate the speed.
What is the peak user concurrency? This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
419 32 Business Intelligence Tools, Point 28 The best practice is to create several prebuilt query
scenarios and compare how each product performs
based on these specific examples. The worst
practice is to just arbitrarily rate the speed.
What is the expected user growth year on year for the
next 5 years?
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
420 32 Business Intelligence Tools, Point 31 There should be separate criteria for BI user online
help versus technical documentation.
We understand that a user guide needs to be provided
for the business users. Please confirm our
understanding.
Yes
421 32 Business Intelligence Tools, Point 31 Ability to handle and summarize huge volumes of
data. E.g. 30-40 million rows accessed on index and
summarized over 5 to 8 metrics.
Please share the detailed list of SLAs applicable for the
new system.
This information is not required at this stage of EOI.
422 38 Annexure C - Monthly Data
processed in DWH Warehouse
Flat file extract for SBI INVM
Daily File =>151 (Total Files) * 0.9 (GB Avg File Size)
* 31 (days) = 4212.9 (GB) Monthly File =>
2TB(Approx)
We assume that there are a total of 150 daily files and
1 monthly file that need to be ingested into the new
system. Please confirm.
Please refer to Data Ingestion point # 20, page 15
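As an illustrative sanity check of the volume arithmetic quoted in Annexure C for the SBI INVM daily-file feed (151 files x 0.9 GB average x 31 days), the stated figure of 4212.9 GB can be reproduced as follows. The script is our own sketch for illustration only; it is not part of the EOI or the Bank's responses:

```python
# Sanity check of the Annexure C daily-file volume arithmetic.
# The inputs (151 files/day, 0.9 GB average, 31 days) are quoted from the EOI;
# the helper function itself is a hypothetical sketch.

def monthly_volume_gb(files_per_day: int, avg_file_gb: float, days: int) -> float:
    """Total monthly ingestion volume in GB for a daily flat-file feed."""
    return files_per_day * avg_file_gb * days

daily_feed_gb = monthly_volume_gb(files_per_day=151, avg_file_gb=0.9, days=31)
print(f"SBI INVM daily-file feed per month: {daily_feed_gb:.1f} GB")
print(f"Approximate size in TB (GB / 1024): {daily_feed_gb / 1024:.2f} TB")
```

The same pattern applies to the weekly PSG and SBI Card extracts and the ~22 GB/day ATM/INB load quoted in the subsequent entries.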
423 38 Annexure C - Monthly Data
processed in DWH Warehouse
Flat file extract for SBI INVM
Daily File =>151 (Total Files) * 0.9 (GB Avg File Size)
* 31 (days) = 4212.9 (GB) Monthly File =>
2TB(Approx)
What is the normal refresh time for daily batch?
Please provide details about average start time and
end time for various stages of ETL, i.e. Source to
Staging, Staging to EDW, Aggregations, etc.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
424 38 Annexure C - Monthly Data
processed in DWH Warehouse
Flat file extract from ATM, INB, PSG, and SBI Card
1.PSG Weekly=>102 MB per week*4=408 MB (Avg
File Size)
2.SBI Card Weekly=>1 GB per week*4=4 GB (Avg
File Size)
3. [~22 GB per day*31=~682 GB]load for file
ATM,INB base]=1147
We assume that there are a total of 4 file extracts
from the PSG system that need to be ingested weekly
into the new warehouse. Please confirm.
Please refer to Data Ingestion point # 20, page 15
425 38 Annexure C - Monthly Data
processed in DWH Warehouse
Flat file extract from ATM, INB, PSG, and SBI Card
1.PSG Weekly=>102 MB per week*4=408 MB (Avg
File Size)
2.SBI Card Weekly=>1 GB per week*4=4 GB (Avg
File Size)
3. [~22 GB per day*31=~682 GB]load for file
ATM,INB base]=1147
We assume that there are a total of 4 file extracts
from the SBI Card system that need to be ingested
weekly into the new warehouse. Please confirm.
Please refer to Data Ingestion point # 20, page 15
426 38 Annexure C - Monthly Data
processed in DWH Warehouse
Flat file extract from ATM, INB, PSG, and SBI Card
1.PSG Weekly=>102 MB per week*4=408 MB (Avg
File Size)
2.SBI Card Weekly=>1 GB per week*4=4 GB (Avg
File Size)
3. [~22 GB per day*31=~682 GB]load for file
ATM,INB base]=1147
Please clarify how the data will be provided from the
ATM and INB systems.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in Data Ingestion, points #6 and #20, of the
EOI.
427 38 Annexure C - Monthly Data
processed in DWH Warehouse
Contribution from other source systems like DMAT,
CMP, SBI Life, LOS, etc.
Monthly 200 GB(Approx)
How will the data be provided from the mentioned
source systems?
A) Please elaborate on the number of files/tables to
be ingested.
B) What will be the frequency by which the data will
be provided?
C) What is the format of data feeds coming from
various source systems? Is it delimited files Push or
RDBMS data read by ETL tool or any other method?
Please provide details.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in Data Ingestion, points #6 and #20, of the
EOI.
428 39 Annexure D - Existing Data
Warehouse Architecture
Database appliance
500+ TB of data with compression index of 2.5
We understand that the total compressed data in the
data warehouse is 500TB. What is the maximum
storage capacity of the current data warehouse
infrastructure?
This information is not required at this stage of EOI.
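For reference, the Annexure D figure ("500+ TB of data with compression index of 2.5") can be turned into a rough uncompressed-size estimate, assuming the "compression index" means a 2.5:1 compression ratio; that interpretation is our assumption, not a definition given in the EOI:

```python
# Rough uncompressed-size estimate from the Annexure D figures.
# Assumes "compression index of 2.5" means a 2.5:1 ratio -- our assumption,
# not defined in the EOI.

compressed_tb = 500          # "500+ TB of data" (compressed), from Annexure D
compression_ratio = 2.5      # assumed meaning of "compression index of 2.5"

raw_tb = compressed_tb * compression_ratio
print(f"Estimated raw (uncompressed) data: {raw_tb:.0f} TB (~{raw_tb / 1024:.2f} PB)")
```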
429 39 Annexure D - Existing Data
Warehouse Architecture
Data Sourcing and Extraction Jobs
17000 Data sourcing and data extraction jobs
What is the complexity-wise (Simple/Medium/Complex)
breakup of jobs?
For the purpose of this EOI consider all jobs as complex in nature
430 39 Annexure D - Existing Data
Warehouse Architecture
Reporting
300+ Interactive reports
100+ Busted reports
We understand that this provides the number of
reports being generated from the current DWH that
need to be migrated to the new system. Please
confirm.
These are indicative numbers given for reference. Actual numbers may
change in future.
431 39 Annexure D - Existing Data
Warehouse Architecture
Reporting
300+ Interactive reports
100+ Busted reports
What is the complexity-wise (Simple/Medium/Complex)
breakup of reports?
For the purpose of this EOI consider all reports as complex in nature
432 39 Annexure D - Existing Data
Warehouse Architecture
Reporting
300+ Interactive reports
100+ Busted reports
Are there any new reports or dashboards that need to
be created in the new system? Please provide the
number of reports and dashboards along with the
complexity (Simple/Medium/Complex) breakup.
Yes, new reports and dashboards will be required to be created in new
system as and when required in future. Further details are not required at
this point of time. Bidders are free to propose solution(s) in the best interest
of the Bank to meet the requirements given in the EOI.
433 39 Annexure D - Existing Data
Warehouse Architecture
Reporting
300+ Interactive reports
100+ Busted reports
How many years of data have to be available in the
Interactive reports?
Interactive reports might be run on the complete data set, depending on future
requirements.
434 39 Annexure D - Existing Data
Warehouse Architecture
Reporting
300+ Interactive reports
100+ Busted reports
Is there a requirement for Dashboards? If yes, please
provide the
Please refer to Business Intelligence Tools on page # 30
435 25 Regulatory Reporting, Point 1 Vendor should follow the RBI guideline in
developing the solution with which it will be easier
for the Bank to migrate to the element-based data
reporting envisaged by the RBI.
Is SDMX-format reporting required? This information is not required at this stage of EOI.
436 25 Regulatory Reporting, Point 5 Data Lineage and Transparency – Tool should
retrace the journey of the source data through
every single workflow process or calculation
across siloed systems all the way to disclosures.
Request bank to elaborate more on this requirement. This is industry-standard terminology
437 26 Regulatory Reporting, Point 11 Pre-submission Review – Multiple report writers
should allow users to review reports in various
formats before submission, with the ability to drill
down and make manual adjustments where
necessary.
What type of system is expected for review? Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
438 26 Regulatory Reporting, Point 12 Tool must support generation of reports in XBRL
format
What is the current mechanism being used to
generate XBRL format?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
439 22 Data Governance, Point 7 Automated propagation of changes to NEXT-GEN
DW Data Dictionary and business glossary by
multiple sources as and when changes occur in
source.
What level of detail is expected in the Data Dictionary
and business glossary?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
440 14 Data Ingestion, point 7 Data sanity checks, automated reject processing,
validations and reconciliation of data should be
available as part of data ingestion solution to
ensure the integrity of data.
Automated reject processing is feasible for known
issues; however, unforeseen ones need to follow a
testing and validation process.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
441 21 Data Governance Provide traceability – it should be possible to track
and visualize any data transformation or any rule
applied to data in the source system -> Next Gen
DWH -> Downstream systems
No query asked.
442 24 Security and Compliance Authentication and Identity Management - TCS understands that the Bank is looking
for an IAM solution for user life-cycle management. Can the vendor consider
deployment of a new IAM tool for Next-Gen DW?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
443 25 Security and Compliance Data protection - It should be possible to protect
the data in the NEXT-GEN DW throughout its
lifecycle including data at rest and data in motion.
Data Leakage - Security CIA parameters should be
achieved, and tools should be able to find and alert
on Data leakage
Has data classification been done for Next-Gen DW? Is the Bank
looking at deployment of a DLP tool?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
444 25 Security and Compliance Auditing - Audit or diagnostic logs should be used
to log management-related activities or data-
related activities. Log management and auditing of
all critical activities on NEXT-GEN DW is a critical
requirement. The Bank reserves right to ask the
vendor to produce / analyze logs for reporting
purposes.
TCS understands that the Bank is looking for a DAM solution. Is this
understanding correct?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
445 13 Annexure B Technical Criteria/Scope
of Work
Detailed Migration Plan including timelines from
existing to new setup
The detailed Migration plan and timelines can only be
provided after completely understanding the existing
DWH solution.
Duplicate Query. Refer to Sr.No. #396
446 14 Data Ingestion, Point 8 Existing ETL Jobs to be Fine Tuned. What is the current ETL tool being used by the bank?
Does the Bank expect the vendor to propose the same
tool?
Duplicate Query. Refer to Sr.No. #383
447 14 Data Ingestion, Point 8 Existing ETL Jobs to be Fine Tuned. Request the bank to elaborate more on this
statement. Does the Bank expect the existing ETL
jobs to keep running in the current DWH even after the
new DWH is functional?
Duplicate Query. Refer to Sr.No. #384
448 14 Data Ingestion, Point 15 Vendor should list out all types of risks they expect
from the ingestion subsystem (e.g., dropping of
data packets during ingestion, security loopholes,
unprotected personally identifiable information,
etc.) along with mechanisms and processes they
would implement for mitigating such risks.
The detailed list of risks can be suggested only when
we have a detailed understanding of the bank's current
systems.
Duplicate Query. Refer to Sr.No. #385
449 15 Data Ingestion, Point 16 Proposed solution should be able to scrape
encrypted logs, capture Metadata changes at source
level completely, scraping 4000-5000 logs daily
having log size of ~ 2 TB each, scalable up to 10000
logs. Proposed solution should be capable of
scraping logs generated by any type of Database,
e.g. Oracle Database, IBM DB2 Database etc.
What is the mechanism of scraping the logs in the
current DWH? Request bank to share the tool details.
Duplicate Query. Refer to Sr.No. #386
450 15 Data Ingestion, Point 17 Solution should be able to handle DDL change
without manual reorg/runstat. It should handle
network fluctuations and hindrances.
We understand that if there is any structural change in
an object at the source which is being ingested into the
new system, then that change, if required, needs to be
propagated to the new system and will need manual
intervention. Please confirm our understanding.
Duplicate Query. Refer to Sr.No. #387
451 15 Data Ingestion, Point 21 Vendor should propose which technology is
suitable for each kind of upstream data ingestion
like Data Warehouse, Data Marts, Data Lake, Use
Data Virtualization/Federation layer, etc.
Request bank to elaborate on this requirement. Duplicate Query. Refer to Sr.No. #388
452 16 Data Storage, Point 11 The storage system should be robust to handle at
least 1,50,000 concurrent queries (Select/DML) by
processing engines / ETL jobs / end users scalable
up to 6,00,000 concurrent queries in next 5 years
(assuming parallelism of 100 degree).
Request the bank to elaborate on the kind of queries
the end users will be making on the storage layer. Also
please elaborate on the different types of users that
will be accessing the storage layer.
Duplicate Query. Refer to Sr.No. #389
453 16 Data Storage, Point 12 Downstream departments (data consumer) to be
given separate processing power, storage to
undertake their requirements with separate DB
snapshot, Audit trails should be available for any
user accessing the Databases.
Please provide a list of all downstream applications
which will be consuming data from the data storage
layers.
Duplicate Query. Refer to Sr.No. #390
454 17 Data Processing Framework, Point 2 The NEXT-GEN DW ecosystem should have state of
the art data processing engines that can perform in-
memory processing to reduce the time for data
transformations and query in case of real time
requirements.
What are the source systems from which data will be
ingested in real time?
Duplicate Query. Refer to Sr.No. #391
455 17 Data Processing Framework, Point 2 The NEXT-GEN DW ecosystem should have state of
the art data processing engines that can perform in-
memory processing to reduce the time for data
transformations and query in case of real time
requirements.
What are the use cases for real time reporting? Duplicate Query. Refer to Sr.No. #392
456 17
Data Processing Framework, Point 6
Have a workflow management and scheduling
solution to schedule data transformation, data
acquisition or data delivery jobs.
Allocations of separate workload channel to
designated queries
Request bank to elaborate on this requirement. Duplicate Query. Refer to Sr.No. #393
457 17
Data Processing Framework, Point 12
ETL/ELT tool for data extraction should have AI/ML
features for suggesting / improving Query / ETL /
ELT Stages
Request bank to elaborate more on this requirement Duplicate Query. Refer to Sr.No. #394
458 18 Data Processing Framework, Point 18 Transformations for this activity can be categorized
into the following types:
migrated to NEXT-GEN DW
not sourced by DWH
real time data capture
Request Bank to provide the details of the existing
transformations that needs to be migrated into the
NEXT-GEN DW.
Duplicate Query. Refer to Sr.No. #395
459 18 Migration from Existing Setup to
Proposed Solution, Point 1
Vendor should propose a detailed seamless
automated migration plan from existing setup to
proposed solution. Plan should focus on less
manual intervention, data reconciliation between
the systems and minimum parallel run of existing
and proposed solution.
The detailed Migration plan and timelines can only be
provided after completely understanding the existing
DWH solution.
Duplicate Query. Refer to Sr.No. #396
460 19 Migration from Existing Setup to
Proposed Solution, Point 10
Vendor should list out all types of risks they expect
during the migration. Vendor should provide
justification if any downtime is required on existing
or proposed system during migration. Vendor
should provide all the pre-requisites for the
migration in the proposal.
The list of risks expected during data migration can be
listed only after understanding the existing DWH
solution.
Duplicate Query. Refer to Sr.No. #397
461 19 Migration from Existing Setup to
Proposed Solution, Point 10
Vendor should list out all types of risks they expect
during the migration. Vendor should provide
justification if any downtime is required on existing
or proposed system during migration. Vendor
should provide all the pre-requisites for the
migration in the proposal.
The requirement of a downtime can be proposed after
understanding the complete ecosystem of the current
DWH solution.
Duplicate Query. Refer to Sr.No. #398
462 19 Migration from Existing Setup to
Proposed Solution, Point 11
Vendor to review the existing architecture during
migration and remove duplication of data and
recommend improvements in overall setup if any
Does the bank intend to continue using the current
DWH solution even after the new solution is live?
Please elaborate.
Duplicate Query. Refer to Sr.No. #399
463 19 Migration from Existing Setup to
Proposed Solution, Point 12
Vendor should provide a feasible plan for best use
of existing infrastructure which is procured during
last 10 years in staggered manner during the
implementation of Next-Gen DW which will save
cost to the Bank. (Annexure D gives the technology
architecture of the current setup)
Request bank to share the complete technical stack of
the current DWH in order for us to suggest the best
feasible plan for the existing infrastructure.
Duplicate Query. Refer to Sr.No. #400
464 20 Cloud Integration and Migration,
Point 1
NEXT-GEN DW should be able to consume data
from external cloud-based infrastructures.
Request bank to share the complete information
about the external cloud based source systems from
which data needs to be consumed in the new system
Duplicate Query. Refer to Sr.No. #401
465 20 Cloud Integration and Migration,
Point 5
In view of the intent to reduce the hardware
footprint (in future), the technical architecture of
NEXT-GEN DW solution should be flexible to
accommodate provisioning of NEXT-GEN DW on
cloud. The Bank understands that there can be
differences in services offered by cloud service
providers. The NEXT-GEN DW solution architecture
should be designed considering as-is infrastructure
availability in cloud.
We understand that the proposed solution should be
compatible as IAAS on the cloud platform. Please
confirm our understanding.
Duplicate Query. Refer to Sr.No. #402
466 20 Cloud Integration and Migration,
Point 6
Adherence to global standards related to cloud Request bank to share the global standards mentioned
in this requirement.
Duplicate Query. Refer to Sr.No. #403
467 21 Monitoring Dashboard, Point 16 Back Dated Data changes needs to be updated on
portal
Which portal is being referred to in this statement? Duplicate Query. Refer to Sr.No. #404
468 22 Data Governance, Point 12 Metadata Management Capability: Tool should
cater to three broad categories of metadata;
Business metadata, Technical metadata and
Operational metadata
Does the current DWH solution have a Metadata
Management Solution? If yes, please share the tool
used .
Duplicate Query. Refer to Sr.No. #405
469 22 Data Governance, Point 12 Metadata Management Capability: Tool should
cater to three broad categories of metadata;
Business metadata, Technical metadata and
Operational metadata
If there is an existing Metadata Management Solution,
do we need to migrate the existing metadata to the
new solution?
Duplicate Query. Refer to Sr.No. #406
470 22 Data Governance, Point 13 Masterdata Management Capability: Master Data
Management tool (s) should deliver consolidated,
complete and accurate view of business-critical
master information to all the operational and
analytical systems across the Bank.
Does the current DWH solution have a MDM Solution?
If yes, please share the tool used .
Duplicate Query. Refer to Sr.No. #407
471 22 Data Governance, Point 13 Masterdata Management Capability: Master Data
Management tool (s) should deliver consolidated,
complete and accurate view of business-critical
master information to all the operational and
analytical systems across the Bank.
Request bank to elaborate more on the MDM solution
that is required.
Duplicate Query. Refer to Sr.No. #408
472 23 Data Quality, Point 12 Mechanism to capture feedback from end users to
report Data Quality issues
Request bank to elaborate more on this requirement Refer to Sr.No. #318
473 27 Data Encryption, Point 5 The overall SLA for data processing should be
adhered to, keeping data encryption as an
important activity.
What is the SLA for a data processing job? Please
provide details of all the SLAs that will be applicable
to the new system.
Duplicate Query. Refer to Sr.No. #410
474 27 Data Encryption, Point 6 Proposed Solution should be capable on ingesting
encrypted data from source system. It should
support the encryption/decryption mechanism
implemented at source system.
Request bank to share the encryption/decryption
mechanism implemented at source system.
Duplicate Query. Refer to Sr.No. #411
475 28 Downstream Data Consumption Self-service portal to extract the data on their own
(Should support Data Democratization)
Request bank to elaborate more on this requirement.
Does the vendor need to propose a portal solution for
this requirement?
Duplicate Query. Refer to Sr.No. #412
476 30 Business Intelligence Tools, Point 5 Visualizations: BI tools must provide below
different types of visualizations;
Please provide the use case for visualizations having
animations.
Duplicate Query. Refer to Sr.No. #413
477 30 Business Intelligence Tools, Point 6 Reporting on all types of available Data Formats; Please provide the use case for reporting on
Multimedia data.
Duplicate Query. Refer to Sr.No. #414
478 32 Business Intelligence Tools, Point 26 Ease of analytical use: There should be different
criteria defined for each type of user, such as
information consumer, business analyst and IT.
What are the different types of users that will be
accessing the Business Intelligence Solution?
Duplicate Query. Refer to Sr.No. #415
479 32 Business Intelligence Tools, Point 26 Ease of analytical use: There should be different
criteria defined for each type of user, such as
information consumer, business analyst and IT.
Please provide count of such BI Power Users and BI
Recipient users.
Duplicate Query. Refer to Sr.No. #416
480 32 Business Intelligence Tools, Point 28 The best practice is to create several prebuilt query
scenarios and compare how each product performs
based on these specific examples. The worst
practice is to just arbitrarily rate the speed.
Request bank to elaborate more on this requirement
with an example.
Duplicate Query. Refer to Sr.No. #417
481 32 Business Intelligence Tools, Point 28 The best practice is to create several prebuilt query
scenarios and compare how each product performs
based on these specific examples. The worst
practice is to just arbitrarily rate the speed.
What is the peak user concurrency? Duplicate Query. Refer to Sr.No. #418
482 32 Business Intelligence Tools, Point 28 The best practice is to create several prebuilt query
scenarios and compare how each product performs
based on these specific examples. The worst
practice is to just arbitrarily rate the speed.
What is the expected user growth year on year for the
next 5 years?
Duplicate Query. Refer to Sr.No. #419
483 32 Business Intelligence Tools, Point 31 There should be separate criteria for BI user online
help versus technical documentation.
We understand that a user guide needs to be provided
for the business users. Please confirm our
understanding.
Duplicate Query. Refer to Sr.No. #420
484 32 Business Intelligence Tools, Point 31 Ability to handle and summarize huge volumes of
data. E.g. 30-40 million rows accessed on index and
summarized over 5 to 8 metrics.
Please share the detailed list of SLAs applicable for the
new system.
Duplicate Query. Refer to Sr.No. #421
485 38 Annexure C - Monthly Data
processed in DWH Warehouse
Flat file extract for SBI INVM
Daily File =>151 (Total Files) * 0.9 (GB Avg File Size)
* 31 (days) = 4212.9 (GB) Monthly File =>
2TB(Approx)
We assume that there are a total of 150 daily files and
1 monthly file that need to be ingested into the new
system. Please confirm.
Duplicate Query. Refer to Sr.No. #422
486 38 Annexure C - Monthly Data
processed in DWH Warehouse
Flat file extract for SBI INVM
Daily File =>151 (Total Files) * 0.9 (GB Avg File Size)
* 31 (days) = 4212.9 (GB) Monthly File =>
2TB(Approx)
What is the normal refresh time for daily batch?
Please provide details about average start time and
end time for various stages of ETL, i.e. Source to
Staging, Staging to EDW, Aggregations, etc.
Duplicate Query. Refer to Sr.No. #423
487 38 Annexure C - Monthly Data
processed in DWH Warehouse
Flat file extract from ATM, INB, PSG, and SBI Card
1.PSG Weekly=>102 MB per week*4=408 MB (Avg
File Size)
2.SBI Card Weekly=>1 GB per week*4=4 GB (Avg
File Size)
3. [~22 GB per day*31=~682 GB]load for file
ATM,INB base]=1147
We assume that there are a total of 4 file extracts
from the PSG system that need to be ingested weekly
into the new warehouse. Please confirm.
Duplicate Query. Refer to Sr.No. #424
488 38 Annexure C - Monthly Data
processed in DWH Warehouse
Flat file extract from ATM, INB, PSG, and SBI Card
1.PSG Weekly=>102 MB per week*4=408 MB (Avg
File Size)
2.SBI Card Weekly=>1 GB per week*4=4 GB (Avg
File Size)
3. [~22 GB per day*31=~682 GB]load for file
ATM,INB base]=1147
We assume that there are a total of 4 file extracts
from the SBI Card system that need to be ingested
weekly into the new warehouse. Please confirm.
Duplicate Query. Refer to Sr.No. #425
489 38 Annexure C - Monthly Data
processed in DWH Warehouse
Flat file extract from ATM, INB, PSG, and SBI Card
1.PSG Weekly=>102 MB per week*4=408 MB (Avg
File Size)
2.SBI Card Weekly=>1 GB per week*4=4 GB (Avg
File Size)
3. [~22 GB per day*31=~682 GB]load for file
ATM,INB base]=1147
Please clarify how the data will be
provided from the ATM and INB systems.
Duplicate Query. Refer to Sr.No. #426
490 38 Annexure C - Monthly Data
processed in DWH Warehouse
Contribution from other source systems like DMAT,
CMP, SBI Life, LOS, etc.
Monthly 200 GB(Approx)
How will the data be provided from the mentioned
source systems?
A) Please elaborate on the number of files/tables to
be ingested.
B) What will be the frequency by which the data will
be provided?
C) What is the format of data feeds coming from
various source systems? Is it delimited files Push or
RDBMS data read by ETL tool or any other method?
Please provide details.
Duplicate Query. Refer to Sr.No. #427
491 39 Annexure D - Existing Data
Warehouse Architecture
Database appliance
500+ TB of data with compression index of 2.5
We understand that the total compressed data in the
data warehouse is 500TB. What is the maximum
storage capacity of the current data warehouse
infrastructure?
Duplicate Query. Refer to Sr.No. #428
492 39 Annexure D - Existing Data
Warehouse Architecture
Data Sourcing and Extraction Jobs
17000 Data sourcing and data extraction jobs
What is the complexity-wise
(Simple/Medium/Complex) breakup of jobs?
Duplicate Query. Refer to Sr.No. #429
493 39 Annexure D - Existing Data
Warehouse Architecture
Reporting
300+ Interactive reports
100+ Busted reports
We understand that this provides the number of
reports being generated from the current DWH that
need to be migrated to the new system. Please
confirm.
Duplicate Query. Refer to Sr.No. #430
494 39 Annexure D - Existing Data
Warehouse Architecture
Reporting
300+ Interactive reports
100+ Busted reports
What is the complexity-wise
(Simple/Medium/Complex) breakup of reports?
Duplicate Query. Refer to Sr.No. #431
495 39 Annexure D - Existing Data
Warehouse Architecture
Reporting
300+ Interactive reports
100+ Busted reports
Are there any new reports or dashboards that need to
be created in the new system? Please provide the
number of reports and dashboards along with the
complexity (Simple/Medium/Complex) breakup.
Duplicate Query. Refer to Sr.No. #432
496 39 Annexure D - Existing Data
Warehouse Architecture
Reporting
300+ Interactive reports
100+ Busted reports
How many years of data have to be available in the
Interactive reports?
Duplicate Query. Refer to Sr.No. #433
497 39 Annexure D - Existing Data
Warehouse Architecture
Reporting
300+ Interactive reports
100+ Busted reports
Is there a requirement for Dashboards? If yes, please
provide the
Duplicate Query. Refer to Sr.No. #434
498 25 Regulatory Reporting, Point 1 Vendor should follow the RBI guideline in
developing the solution with which it will be easier
for the Bank to migrate to the element-based data
reporting envisaged by the RBI.
Is this a requirement for SDMX format reporting? Duplicate Query. Refer to Sr.No. #435
499 25 Regulatory Reporting, Point 5 Data Lineage and Transparency – Tool should
retrace the journey of the source data through
every single workflow process or calculation
across siloed systems all the way to disclosures.
Request bank to elaborate more on this requirement. Duplicate Query. Refer to Sr.No. #436
500 26 Regulatory Reporting, Point 11 Pre-submission Review – Multiple report writers
should allow users to review reports in various
formats before submission, with the ability to drill
down and make manual adjustments where
necessary.
What type of system is expected for review? Duplicate Query. Refer to Sr.No. #437
501 26 Regulatory Reporting, Point 12 Tool must support generation of reports in XBRL
format
What is the current mechanism being used to
generate XBRL format?
Duplicate Query. Refer to Sr.No. #438
502 22 Data Governance, Point 7 Automated propagation of changes to NEXT-GEN
DW Data Dictionary and business glossary by
multiple sources as and when changes occur in
source.
What level of detail is expected in the Data Dictionary
and business glossary?
Duplicate Query. Refer to Sr.No. #439
503 14 Data Ingestion, point 7 Data sanity checks, automated reject processing,
validations and reconciliation of data should be
available as part of data ingestion solution to
ensure the integrity of data.
Automated reject processing is feasible for known
issues; however, unforeseen ones need to follow a
testing and validation process.
Duplicate Query. Refer to Sr.No. #440
504 30 Business Intelligence Tools, Point 6 Reporting on all types of available Data Formats; Please provide the details of the sources from which
unstructured and semi-structured data will be
ingested.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
505 38 Annexure C - Monthly Data
processed in DWH Warehouse
Source System Details We understand that this Annexure provides the details
of the sources which are being ingested in the current
data warehouse and the same needs to be integrated
into the new system. Please confirm our
understanding.
Please refer to Annexure C for more details
506 38 Annexure C - Monthly Data
processed in DWH Warehouse
Source System Details Are there any new source systems that will need to be
ingested into the new system? For each of the source
systems, please provide the following details:
a) Please provide the number of files/tables to be
ingested from the new source systems.
b) Will the data be ingested in real time or batch mode?
c) What will be the frequency by which the data will
be provided?
d) What is the format of data feeds coming from
various source systems? Is it delimited files Push or
RDBMS data read by ETL tool or any other method?
e) What will be the volume of data that will be
ingested daily?
f) What will be the data volume growth year on year
for the next 5 years?
g) What is the current volume of data in the source
system?
Please refer to Annexure C for more details
507 13 Data Ingestion - Point 1 Capable of ingesting data from any source system
in automated manner currently implemented in the
Bank, or any future standard source systems that
the Bank will decide to use with high throughput
and low latency. Vendor to propose performance
benchmarking for the same.
Please elaborate on "Vendor to propose
performance benchmarking for the same."
Bidder to provide details of performance benchmarking to enable us to take a
holistic and comprehensive view of the architecture in formulating next
course of action
508 14 Data Ingestion - Point 3 GUI based framework to configure sources to NEXT-
GEN DW
1. Does the existing system have a GUI framework that
can serve as a base or be reused?
2. Or do we need to build an altogether new GUI
framework?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
509 14 Data Ingestion - Point 4 Ingestion subsystem should allow to configure
ingestion processes from single / multiple source
system, single / multiple files, single / multiple
operational input files
Should the ingestion subsystem also be configurable
through the GUI framework?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
510 14 Data Ingestion - Point 8 Existing ETL Jobs to be Fine Tuned. Re-runnablity
checkpoints should be present in ETL jobs. New ETL
jobs should be able to parallel read and write data.
1. What percentage of ETL jobs need to be fine-tuned?
We need to understand the existing ETL jobs'
performance statistics.
2. Please provide the total number of ETL jobs and the
technology in which they were created.
3. Please provide statistics on the number of existing
ETL jobs and the number of BI reports in the existing system.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
511 14 Data Ingestion - Point 12 An alerting report and monitoring utility about the
ingest pipelines should be available as part of the
solution.
Are we looking at a standalone monitoring utility and
related reports which will be used for alerting /
notification?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
512 14 Data Ingestion - Point 13 Trigger mechanisms in identifying any structural
changes at source
Please elaborate more This is an industry standard terminology
513 15 Data Ingestion - Point 17 Solution should be able to handle DDL change
without manual reorg/runstat. It should handle
network fluctuations and hindrances.
Please elaborate network fluctuations and hindrances
handling
This is an industry standard terminology
514 16 Data Storage - Point 6 User should be able to work on DB even while
backup is in progress. They should be able to run
statistics and reorganize their tables. Any
background process including backup must not
hamper performance of user queries.
Please elaborate more "They should be able to run
statistics and reorganize their tables. Any background
process including backup must not hamper
performance of user queries"
This is an industry standard terminology
515 17 Data Processing Framework - Point
12
ETL/ELT tool for data extraction should have AI/ML
features for suggesting /
improving Query / ETL / ELT Stages
Please elaborate more This is an industry standard terminology
516 18 Data Federation/Virtualization Point
4
Data virtualization should support the use of APIs. Please elaborate more Proposed solution should have capability of APIs to connect to upstream and
downstream applications
517 18 Migration from Existing Setup to
Proposed Solution Point 3
Data migration from existing archival solution to
new one.
Please elaborate more; we also need to understand the
existing archival solution.
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
518 21 Monitoring Dashboard Point 13 List of daily missing files from source systems Please elaborate more Monitoring dashboard should have capability to automatically flag/highlight
any missed file used for data ingestion from source systems
519 21 Monitoring Dashboard Point 16 Back Dated Data changes needs to be updated on
portal
Please elaborate more Monitoring dashboard should have capability to showcase back-dated data
changes
520 21 Data Governance Point 5 Capability to classify and store (personal
identifiable information) sensitive data in
encrypted /masked form and should have capability
to decrypt/unmask such information in NEXT-GEN
DW when required by only authorized ID’s.
1. Does the Bank have an encryption/decryption utility,
algorithm, or software which can be reused in the
implementation?
2. Or does the vendor need to develop one or use an off-the-shelf tool?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
521 22 Data Governance Point 11 Data modeling capabilities to be provided by the
tool
Please elaborate more This is an industry standard capability
522 39 Annexure B Technical Criteria/Scope
of Work
Critical Functional Requirements: Data Quality Currently, how do you check for data quality issues?
Describe the current process, tools used if any, and
challenges with the current approach.
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
523 39 Annexure B Technical Criteria/Scope
of Work
Critical Functional Requirements: Data Quality Please describe the type of data quality issues that
you expect to identify in your production data e.g.
presence of NULLs, invalid format of certain fields,
presence of certain unexpected characters/numbers in
the value of field, etc.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
524 39 Annexure B Technical Criteria/Scope
of Work
Critical Functional Requirements: Data Quality How are data quality related requirements being
handled currently? (e.g. custom script, non-disclosure
agreement, in-house solution, etc.) What
tools/processes are currently deployed and what are
their challenges / limitations?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
525 39 Annexure B Technical Criteria/Scope
of Work
Critical Functional Requirements: Data Quality What is the business nature of the data on which
data quality processes such as profiling, cleansing,
deduplication, standardization, etc. are required to be
executed?
E.g. Customer, Product, Vendor, Item, Security, or any
other. Please describe in detail.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
526 39 Annexure B Technical Criteria/Scope
of Work
Critical Functional Requirements: Data Quality Is there any requirement for address standardization?
If yes, please specify for which geography(ies)
address standardization of customer data is required.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
527 39 Annexure B Technical Criteria/Scope
of Work
Critical Functional Requirements: Data Quality How is Master Data Management being carried out as
of now? Are there any tools being used? If so, are there
any requirements which are not being addressed by
that tool?
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
528 39 Annexure B Technical Criteria/Scope
of Work
Critical Functional Requirements: Data Quality What is the nature of input(s) - Please provide
complete details:
a. Flat/XML files - Please mention its format and
structure
b. Direct connection to source database - please
specify database technology (e.g. Oracle, MS SQL
Server, DB2, etc.)
c. Data extracted from database into flat files - Please
mention structure
d. Unstructured data - Please provide details
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
529 39 Annexure B Technical Criteria/Scope
of Work
Critical Functional Requirements: Data Quality Is there any data quality requirement for special
technologies:
a. Data quality in Mainframe environment
b. Data quality in cloud
c. Data quality in Big Data landscape
d. Data quality in packaged applications such as
SAP/ERP/CRM.
If yes, please provide details, so that we can check the
fitment.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
530 39 Annexure B Technical Criteria/Scope
of Work
Critical Functional Requirements: Data Quality How is the output from the Data Quality Management
service going to be used? What is the target system?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
531 39 Annexure B Technical Criteria/Scope
of Work
Critical Functional Requirements: Data Quality What is the volume of Data to be processed for data
profiling and data quality management?
Refer Annexure G for sizing.
532 39 Annexure B Technical Criteria/Scope
of Work
Critical Functional Requirements : Data Processing
Framework
Is production data copied 'as is' to test regions for
testing activity? Do developers/testers currently have
access to actual production data in the test region?
No. Please refer to Hardware Specification subsection point number #31 on
page #35
533 39 Annexure B Technical Criteria/Scope
of Work
Critical Functional Requirements : Data Processing
Framework
If the data is directly copied from production, how is
the sensitivity of the information being taken care of?
Refer to Sr. No. #532
534 39 Annexure B Technical Criteria/Scope
of Work
Critical Functional Requirements : Data Processing
Framework
Does SBI already have an incumbent home-grown or
third-party product for the Data Masking requirement?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
535 39 Annexure B Technical Criteria/Scope
of Work
Critical Functional Requirements: Data Governance:
Data at rest/in-motion should be encrypted
Is there a need for dynamic data masking for the
following usage scenarios:
-- dynamic masking of web application screens
-- dynamic masking of application / production logs
-- dynamic masking of database query results
-- document redaction (masking of data in documents)
Please describe the requirement in depth
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
536 39 Annexure B Technical Criteria/Scope
of Work
Critical Functional Requirements: Data Governance:
Data at rest/in-motion should be encrypted
Are there any special dynamic masking requirements
such as dynamic masking for SAP screens, Mainframe
screens, Thick client application, third party packaged
applications, etc.? Please provide all the details that
will enable us to check the fitment.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
537 39 Annexure B Technical Criteria/Scope
of Work
Critical Functional Requirements: Data Governance:
Data at rest/in-motion should be encrypted
In case of dynamic masking of web application
screens, will we have access to the web application
server, to deploy the dynamic masking rules on the
web application server?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
538 11 Eligibility Criteria - Point 2 Vendor should have existing Next-Gen Data
Warehouse solution as mentioned in the EOI
The EOI is an SBI-specific requirement. What does it
mean to have an existing Next-Gen DW solution? All
proposed components have been implemented by the bidder
as part of different engagements with multiple clients. Is
this in line with the Bank's expectations?
Please refer to Corrigendum.
539 13 Technical Criteria/Scope of Work Cost model (How the licensing will be done)- Actual
commercials are not required at this stage
Since the Bank is looking at an on-premise solution and
the components, hardware and software solution need to
be SBI-specific, what does the Bank mean when asking for a
licensing cost model? Is it related to the components'
annual AMC cost?
Licensing model means Capex, Opex, PVU based, User based licensing, etc.
540 32 Hardware Specifications Detailing will need time. We request an extension of 2
weeks for submission of the response to the EOI.
No change in timelines of EOI
541 12 Annexure B - Technical Criteria/Scope
of Work
The Following Specifications for each of PROD, DEV,
UAT and DR environments;
Please share the ratio sizing for the Dev environment Please refer to Annexure-G. Dev to be 10% of the sourced data size.
542 Will SBI's existing EDW and new data lake platform run
in parallel, or will there be a sunset for the existing EDW
platform?
This information is not required at this stage of EOI. Bank will take a final call
at appropriate time.
543 What is the current frequency of EDW refresh from
various sources?
This information is not required at this stage of EOI.
544 What is SBI's current technology stack? Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
545 Is SBI ready for Bidder IP based solution? Bidder may propose any suitable solution. Bank will take a final decision in
the best interest of the Bank
546 1. Apart from the existing Data Warehouse and its
current sources, do any other internal sources of data
exist - Excel sheets / MS Access databases, application
backend databases, other digitized systems?
2. Would this data also be required to be brought into
the Data Lake?
3. What is the volume of such additional data stores?
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
547 Physically where are the current data centers located? This information is not required at this stage of EOI.
548 What tools are currently being used for data masking,
metadata management, web scraping?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
549 Is there any migration expected from existing BI tool
to some other BI tool?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
550 What tools are currently being used for visualization,
dashboards, canned reports, adhoc queries?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
551 Detailing of all components post inputs on queries and
additional information regarding current setup, will
need time. We request for extension of 2 weeks for
submission of response to EOI.
No change in timelines of EOI
552 There is mention of Pilot, demo and POC
interchangeably. What is the bank expectation?
The words 'pilot' and 'demo' are not used in the EOI.
553 In case we need to do a Pilot or POC, it will take
effort, which can be reviewed separately and agreed
upon after mutual discussion.
No change in standard clause of EOI
554 8 Terms and Conditions i. Lodgement of an EOI is evidence of an applicant’s
consent to comply with the terms and condition of
Request for EOI process and subsequent bidding
process. If an applicant fails to comply with any of
the terms, its EOI may be summarily rejected.
i. Lodgement of an EOI is evidence of an applicant’s
consent to comply with the terms and condition of
Request for EOI process and subsequent bidding
process. If an applicant fails to comply with any of the
terms, its EOI may be summarily rejected.
No change in standard clause of EOI
555 8 Terms and Conditions ii. Willful misrepresentation of any fact in the EOI
will lead to the disqualification of the applicant
without prejudice to other actions that the Bank
may take. The EOI and the accompanying
documents will become property of SBI. The
applicants shall be deemed to license, and grant all
rights to SBI, to reproduce the whole or any portion
of their product/solution for evaluation, to disclose
the contents of submission to other applicants and
to disclose and/ or use the contents of submission
as the basis for EOI process.
ii. Willful misrepresentation of any fact in the EOI will
lead to the disqualification of the applicant without
prejudice to other actions that the Bank may take. The
EOI and the accompanying documents will become
property of SBI. The applicants shall be deemed to
license, and grant all rights to SBI, to reproduce the
whole or any portion of their product/solution for
evaluation, to disclose the contents of submission to
other applicants and to disclose and/ or use the
contents of submission as the basis for EOI process.
No change in standard clause of EOI
556 32-33 Hardware Specifications 8. The proposed hardware must not fall into ‘End of
Support’ for at least 7 years from the date of
delivery to the Bank.
12. The Vendor is required to supply, install, test,
commission, monitor, manage and maintain the IT
System along with operating system and other
peripherals with one-year warranty and AMC for 4
years from the date of delivery at data centers
advised by the Bank
18. The proposed hardware is mission critical for
the proposed project and support of 24 X 7 with an
uptime of 99.99 % to be ensured by providing
support at PR, and DR site for a period of 5 years.
22. The Hardware solution must be compatible to
integrate with various systems in the Bank
including but not limited to SOC, PIMS, NOC,
Command Centre, ITAM, Service Desk, ADS, and
SSO etc. at no extra cost. Vendor will have to give
appropriate support to the Bank during integration
with various components of IT environment.
8. The proposed hardware must not fall into ‘End of
Support’ for at least 7 years from the date of delivery
to the Bank
12. The Vendor is required to supply, install, test,
commission, monitor, manage and maintain the IT
System along with operating system and other
peripherals with one-year warranty and AMC for 4
years from the date of delivery at data centers advised
by the Bank
The scope of the warranty shall be limited only to
correction of any bugs that were left undetected
during acceptance testing by the Bank. Warranty shall
not cover any enhancements or changes in the
application software, carried out after acceptance
testing. This warranty is only valid for defects against
approved Specifications. The above mentioned
warranty shall also not apply if there is any (i)
combination, operation, or use of some or all of the
deliverables or any modification thereof furnished
hereunder with information, software, specifications,
instructions, data, or materials not approved by
Vendor and operation of the deliverables on
incompatible hardware not recommended by Vendor;
(ii) any change, not made by Vendor, to some or all of
the deliverables; or (iii) if the deliverables have been
tampered with, altered or modified by the Bank
No change in standard clause of EOI
557 8 TnCs (i) Lodgement of an EOI is evidence of an applicant’s
consent to comply with the terms and condition of
Request for EOI process and subsequent bidding
process
Lodgement of an EOI is evidence of EIT’s consent to
comply with the terms and condition of Request for
EOI process and subsequent bidding process shall be
based on an issue of RFP by Bank & EIT submitting the
Bid response with assumptions if any
No change in standard clause of EOI
558 11 Annex A (2) Vendor should have existing Next-Gen Data
Warehouse solution as mentioned in the EOI
Please validate that vendor is expected to be an SI
who would propose fitting solutions from OEMs
Please refer to Corrigendum.
559 12 Annex B (End State objectives) Scope bullets Please clarify if any of these functionalities can be
delivered using the existing solution in the Bank. If yes,
please provide details of the existing licenses
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
560 12 Annex B (End State objectives) Vendor may propose one or multiple solutions to
meet the scope of work of this EOI
Please validate that vendors can propose multiple
solutions for each objective mentioned or only one
solution can be proposed for each objective and in the
process no. of OEMs can be more than one. From
subsequent lines, it seems multiple solution options
can be proposed on each objective
Please refer to Corrigendum.
561 14 Functional requirements (7) Data sanity checks, automated reject processing,
validations and reconciliation of data should be
available as part of data ingestion solution to
ensure the integrity of data
Please validate that only validation and reconciliation
reports and limited transformation changes at ETL
level would be expected. Changes required at core
systems end for reconciliation of business figures
would be done by those systems basis validation
reports generated during ETL to new DW. Second last
bullet under 'End State Objectives' on pg 12 indicates a
reconciliation solution is expected (perhaps bank had
purchased GL recon solution as part of OFSAA)
The reconciliation proposed is in respect of reconciliation of data between
Next-Gen DW and upstream/downstream applications as well as within Next-
Gen DW ecosystem
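The reconciliation described in this response can be illustrated with a minimal sketch. All table, column, and function names below are hypothetical, purely for illustration, and are not taken from the EOI: the idea is comparing row counts and order-independent checksums between an upstream extract and its Next-Gen DW copy.

```python
import hashlib

def checksum(rows, cols):
    """Order-independent checksum over selected columns of a row set."""
    total = 0
    for row in rows:
        digest = hashlib.sha256(
            "|".join(str(row[c]) for c in cols).encode()
        ).hexdigest()
        total ^= int(digest[:16], 16)  # XOR makes the result order-independent
    return total

def reconcile(source_rows, target_rows, cols):
    """Report whether counts and checksums match between source and target."""
    return {
        "source_count": len(source_rows),
        "target_count": len(target_rows),
        "count_match": len(source_rows) == len(target_rows),
        "checksum_match": checksum(source_rows, cols)
                          == checksum(target_rows, cols),
    }

# Hypothetical daily-balance extract vs. its DW landing copy
source = [{"acct": "A1", "bal": 100}, {"acct": "A2", "bal": 250}]
target = [{"acct": "A2", "bal": 250}, {"acct": "A1", "bal": 100}]
print(reconcile(source, target, ["acct", "bal"]))
```

A real implementation would run such checks per load window and per table, within the Next-Gen DW ecosystem as well as against upstream/downstream systems.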
562 15 Data Storage (2) Vendor should propose the type of storage to opt
for in NEXT-GEN DW (SQL, NO-SQL, etc) and
provide details of the hardware requirements,
supported open source / proprietary components
Please clarify if open source solutions can be proposed
as part of the total solution, or if we only need to give
details of open source solutions supported by the
proposed proprietary solutions
Please refer to EOI clauses for more details
563 19 Migration from Existing Setup to
Proposed Solution (12)
Vendor should provide a feasible plan for best use
of existing infrastructure which is procured during
last 10 years
To assess re-use of existing infrastructure, complete
details of existing HW and SW would be needed in
Annex D (e.g. type, quantity, version etc.)
Currently we are using IBM Stack. Bidders are expected to propose high level
migration plan.
564 20 Data Archival and Backup Data Archival and Backup Please clarify that new archival and backup solutions
are expected
Yes
565 28 Downstream Data Consumption Facility to generate and distribute canned /
automatic bursted reports from NEXT- GEN DW to
downstream end users like BID, Analytics, CRM,
YONO, OFSAA, etc
Please clarify that the Next-Gen DW will provide data to
downstream systems like OFSAA, and that reports would be
generated by the downstream applications
This information is not relevant to EOI.
566 General Please clarify if Next gen DW will be set up only for SBI
India operations or it will include foreign operations
and subsidiaries also
Current scope of work covers both Domestic and International operations of
the State Bank of India (SBI) Group, including subsidiaries.
567 General Please clarify that existing DW SI / other SIs in SBI will
have any advantage of existing licenses
This information is not required at this stage of EOI. Bank will take a final call
at appropriate time.
568 11 Annexure A - Eligibility Criteria
S.No. 3
The solution should have been
implemented in at least 2 large
scale organizations.
There would possibly be multiple solution components
to meet this EOI requirement. Please clarify if all the
solution components need to have 2 references or
only the main solution components are expected to
have 2 references
Please refer to Corrigendum.
569 13
17
Critical Functional Requirements-
Data Ingestion
Performance benchmark for Next Gen DWH
Capable of ingesting data from any source system
in automated manner currently implemented in the
Bank, or any future standard source systems that
the Bank will decide to use with high throughput
and low latency. Vendor to propose
performance benchmarking for the same.
Performance benchmark of all components of Next-
Gen DW to be given by participating Vendors
Please clarify on the details of the benchmarking so
that this is standard across bidders
Bidder to provide details of performance benchmarking to enable us to take a
holistic and comprehensive view of the architecture in formulating next
course of action
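As a hedged illustration of the kind of benchmarking detail a bidder might report for an ingestion component (the ingest function and figures below are placeholders, not SBI requirements): rows ingested per second over a timed batch run.

```python
import time

def ingest(batch):
    """Stand-in for a real ingestion call (e.g. a bulk-load API)."""
    return len(batch)

def benchmark(batches):
    """Time the ingestion of all batches and derive throughput."""
    start = time.perf_counter()
    rows = sum(ingest(b) for b in batches)
    elapsed = time.perf_counter() - start
    return rows, rows / elapsed if elapsed > 0 else float("inf")

# 50 synthetic batches of 1000 records each
batches = [[{"txn": i} for i in range(1000)] for _ in range(50)]
rows, throughput = benchmark(batches)
print(f"{rows} rows at {throughput:,.0f} rows/sec")
```

A comparable harness, run against the actual proposed stack with the Bank's data volumes, is what would make figures comparable across bidders.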
570 25 7 Electronic Submission – Should support for all
regulators globally in all required
formats, including XBRL, XML or other file-based
electronic submission.
Is there an existing SBI XBRL reporting solution which
can be leveraged here?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
571 24 Security and Compliance Authentication and Identity management Can we leverage the existing SBI SOC/security
investment?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
572 26 Data encryption Can we leverage the existing SBI SOC/security
investment?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
573 27 User Management Can we leverage the existing SBI SOC/security
investment?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
574 18 19 The workflows should work with standard
schedulers. Monitoring and management of
workflows should be possible from an easy to use
interface.
Workflow management tool(s) should have
connectors / pluggable interfaces to already
existing / in-use proprietary software available with
the Bank. These could be (and not restricted to)
data repositories, reporting tools, data analysis
tools and generic interfaces for data transfer.
Scheduled jobs status should be made available to
the Bank in Monitoring dashboard on real time
basis.
Please share some detail on the existing schedulers
used at SBI DW
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
575 13 19 Tentative Project Timeline Can the bank please specify the total implementation
duration (in months) planned for implementing the
next-gen DW, and the business priorities which can be
taken up in different phases, including retirement of
existing infrastructure
Participating Bidders are expected to propose high level timelines for this
project as clearly mentioned on page number # 13 of this EOI.
576 18 Data Federation/Virtualization Vendor to propose a solution / tool (s) for Data
Federation/Virtualization to ensure seamless
integration of data in real time when stored in
multiple sources without
physical movement of data sets for the purpose of
reporting / analytics
Is a federation expected between the existing
and next-gen DW? What is the likely duration (no. of
months) for which the federation is expected?
Yes, as per the migration timeline proposed by the Bidder.
Virtualization/Federation is also expected in between different components
of Next-Gen DW along with source systems during the entire period of the
project
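As a toy illustration of the federation concept (not the Bank's tooling): a single SQL query joining two physically separate databases without copying data between them, simulated here with SQLite's ATTACH. All table names and values are invented for the example.

```python
import sqlite3

# Main connection stands in for one source system
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS mart")  # a second, separate store

conn.execute("CREATE TABLE accounts (acct TEXT, branch TEXT)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("A1", "Mumbai"), ("A2", "Delhi")])

conn.execute("CREATE TABLE mart.balances (acct TEXT, bal REAL)")
conn.executemany("INSERT INTO mart.balances VALUES (?, ?)",
                 [("A1", 100.0), ("A2", 250.0)])

# One federated query spanning both stores; no data set is physically moved
rows = conn.execute("""
    SELECT a.acct, a.branch, b.bal
    FROM accounts a JOIN mart.balances b ON a.acct = b.acct
    ORDER BY a.acct
""").fetchall()
print(rows)
```

A production federation/virtualization layer does the same thing across heterogeneous sources, pushing the join down or streaming results rather than replicating the underlying data sets.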
577 25 Regulatory reporting Vendor should follow the RBI guideline in
developing the solution with which it will be easier
for the Bank to migrate to the element-based data
reporting envisaged by the RBI.
Please clarify if the bank would continue to use its ADF/
regulatory reporting solution to provide element-based
data to RBI, or does the bank envisage doing
element-based reporting from the next-gen DW
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
578 28 Annexure B - Technical Criteria/Scope
of Work. Critical Functional
Requirements, "Data Science
Platform with AI/ML Capabilities"
Implementing end to end analytics use-cases as
mandated by the Bank
What is the scope of the term "end-to-end analytics"? This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
579 28 Annexure B - Technical Criteria/Scope
of Work. Critical Functional
Requirements, "Data Science
Platform with AI/ML Capabilities"
Power data / objects to existing analytics models
built on proprietary tools (IBM SPSS). Migration of
such models to new solution
What are existing analytical models/ algorithms in IBM
SPSS that need to be migrated?
This information is not required at this stage of EOI.
580 28 Annexure B - Technical Criteria/Scope
of Work. Critical Functional
Requirements, "Data Science
Platform with AI/ML Capabilities"
Availability of pre-built models which can be directly
used with the Bank's data to get
insights
What are the use cases where the Bank is looking to
use such pre-built models?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
581 29 Annexure B - Technical Criteria/Scope
of Work. Critical Functional
Requirements, "Data Science
Platform with AI/ML Capabilities"
Analytics on real-time data in real-time/near real-
time
What are the real-time or near-real-time uses that are
being envisaged?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
582 29 Annexure B - Technical Criteria/Scope
of Work. Critical Functional
Requirements, "Data Science
Platform with AI/ML Capabilities"
Vendor to provide solution / tool (s) for below
scope of activities on SBI data sets -
· Social Media Analytics
· Web Analytics
Web Analytics, if understood broadly, is analysing how
visitors on a site behave. It might mean implementing a
web analytics tool (Adobe Analytics, for example) to
monitor and evaluate traffic to a site. This would
require several tasks of tagging pages, classifying
them, besides web page optimization using A/B
testing etc. Apart from this, it also includes applying
Advanced Analytics and Machine Learning on batch log
data. The latter will require the use of a Data Science/
Machine Learning platform. In what sense is the Web
Analytics term being used here? A similar clarification
is required for the use of Social Media Analytics too.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
583 29 Annexure B - Technical Criteria/Scope
of Work. Critical Functional
Requirements, "Data Science
Platform with AI/ML Capabilities"
Annexure E gives sample use cases which are to be
implemented on Next Gen Data Warehouse using
structured and/or unstructured and/or semi-
structured and/or any other kind of data gathered
from either Data Warehouse or Data Lake or Data
Virtualization or all together or any other source.
Is Annexure E just a sample list of use cases that need
to be built or does it represent an exhaustive set of all
areas where models need to be developed?
These are sample use cases built for execution on the Next-Gen DWH platform
over a period of time for analytical studies. The Bank, at its own discretion, will
implement new models/use cases on this setup in future.
584 40 Annexure E - Functional Use Cases In general How many of the use cases in Annexure E are already
implemented in the existing DWH and only need to be
migrated and how many of them need to be
implemented from scratch in the New Solution?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
585 40 Annexure E - Functional Use Cases In general Is the expectation to deliver these 32 use cases for the
entire Bank, across its entire customer base and across
all its Businesses? If Model training for these use cases
needs to happen by each customer segment, or
Business (or other groupings) separately, how many
models in total would it translate into?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
586 40 Annexure E - Functional Use Cases In general What are the expected deliverables for the 32 use
cases?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
587 40 Annexure E - Functional Use Cases In general What are the expected timelines / priority of delivery
for all the 32 use cases? This is needed to help us suggest a total
implementation timeline on the Next-gen DW
Participating Bidders are expected to propose timelines for this project as
clearly mentioned on page number # 13 of this EOI.
588 13 2 Data Ingestion
Data may be structured, semi-structured, and
unstructured. It may come from internal or external
sources. It may come in batches, incremental
additions or real-time feeds. There should be no
limitation on the type, format and size of data
ingested. Data may include log, feeds, audio, video,
image, NOSQL, RDBMS, unstructured text, through
ERP systems, etc
Is the NOSQL data mentioned here JSON/XML? What
kind of processing on NOSQL is expected? Is there a
need to join the NOSQL data with other relational
data? Is there a need to shred the NOSQL data into
relational data?
Duplicate Query. Refer to Sr.No. #189
589 16 11 Data Storage
The storage system should be robust to handle at
least 1,50,000 concurrent queries (Select/DML) by
processing engines / ETL jobs / end users scalable
up to 6,00,000 concurrent queries in next 5 years
(assuming parallelism of 100 degree).
Will all these queries be executing in-flight at the
same time? Or will these be initiated over a period of
time, with the aggregate number of queries run during
that time period being 600,000?
Duplicate Query. Refer to Sr.No. #192
590 15 1 Data Storage
Vendor should propose effective number of data
storage layers in NEXT-GEN DW between data
ingestion and data consumption.
In the existing DW solution, are surrogate keys used?
If yes, is there a framework for storage and
management of the keys to ensure robustness of the
data warehouse?
Duplicate Query. Refer to Sr.No. #191
591 16 13 It should be possible to project and view data
through multiple modes using the Storage on NEXT-
GEN DW. Varieties of GUIs should be available to
project or view the output generated through
analytic processes. For instance: The Bank may
decide to implement use-cases that project
transactions data as a graph data structure. The
Storage solution on NEXT-GEN DW should allow for
such projections.
Do we need to enable graph-based analytics as well, or
should it be limited to the facility to access the
data for graph-based analytics?
Duplicate Query. Refer to Sr.No. #193
592 17 2 Data Processing Framework
The NEXT-GEN DW ecosystem should have state of
the art data processing engines that can perform in-
memory processing to reduce the time for data
transformations and query in case of real time
requirements.
What are the expected latency requirements for real-time
processing? The solution may vary based on the use case,
depending on whether the requirement is for immediate
processing vs. a latency of up to 5-10 minutes.
Duplicate Query. Refer to Sr.No. #195
593 17 9 Automatic recovery of data after failure/rejection
of record needs to happen without any manual
intervention
Is there any specific treatment that needs to be
performed for rejected records?
Duplicate Query. Refer to Sr.No. #199
594 17 14 Data Processing Framework
Data transformations should be triggered in
parallel. The NEXT-GEN DW should be capable to
run multiple transformation jobs in parallel. The
NEXT-GEN DW should be able to run at-least 1500
jobs in parallel, scalable up to 5000 in next 5 years,
of varying complexity - simple, medium, complex, in
batch or near real time mode every day.
Is the bulk of the data transformation jobs expected to
be triggered during non-business hours, when user
reporting and other workloads are at a minimum? Are
there going to be users across multiple time zones, or
will a large part of the user base be within a single time
zone?
Duplicate Query. Refer to Sr.No. #201
595 18 2 Migration from Existing Setup to Proposed Solution
Data migration from Staging and Data Marts, user
tables and any other schemas identified by Bank.
Along with the Staging and Data Mart objects, is there
any integrated data layer in the existing solution? If
yes, then is it built using any proprietary data model?
Duplicate Query. Refer to Sr.No. #206
596 29 7 Data Science Platform with AI/ML Capabilities
In-memory computing & integration with Spark,
Redis, etc
What kind of analytic processing is expected on Spark,
Redis, etc.? Is this using Spark-ML, for example?
Duplicate Query. Refer to Sr.No. #207
597 35 33 Hardware Specifications
Next-Gen DW should support at-least 500
concurrent users, scalable up to 1000 users in next
5 years, running ETL/ELT jobs or doing ad-hoc data
extraction requests on database (Not including API
based access or scheduled job connections to
database)
What is the nature and expected concurrency of API-based
access? What is the nature and concurrency of
scheduled job connections - are these ETL or
maintenance-related connections?
Will the 1000 concurrent users be expected to be
running queries simultaneously, or is this just 1000
concurrent logons?
Duplicate Query. Refer to Sr.No. #211
598 36 37 Hardware Specifications
Ad-hoc jobs of any complexity should not hamper
the scheduled jobs performance.
What is the expected mix of queries in terms of tactical
(very short), medium, long-running (reports), batch
loads, near real-time and real-time loads running
simultaneously on the system?
Duplicate Query. Refer to Sr.No. #212
599 19
Migration from Existing Setup to
Proposed Solution
Vendor to review the existing architecture during
migration and remove duplication of data and
recommend improvements in overall setup if any
What deduplication rules have been defined by the Bank?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
600 13 Scope of Work Team structure (without actual profiles) Does the bank envisage OEM involvement in
implementing core OEM related services via SI
Yes, please refer to point number # 16, under Hardware Specification on page
number # 34 in EOI for more details.
601 13 Critical Functional Requirements -
Data Ingestion
Data may be structured, semi-structured, and
unstructured. It may come from internal or external
sources. It may come in batches, incremental
additions or real-time feeds. There should be no
limitation on the type, format and size of data
ingested. Data may include log, feeds, audio, video,
image, NOSQL, RDBMS, unstructured text, through
ERP systems, etc
Please provide the ratio of split of Structured :
Semi-Structured : Unstructured (Images) : Unstructured
(Videos) data. This helps in solutioning
Duplicate Query. Refer to Sr.No. #224
602 25, 39 Critical Functional Requirements -
Regulatory Reporting
Automation –Tool should automate analytics and
reporting workflow end-to-end, including all data
collection, enrichment, and management, as well as
all calculations, processes to final report
submission. Currently 500+ jobs are being used for
Tranche 1 DCT generation along with 500 more for
other regulatory reports/returns.
Please provide the number of returns/reports for
regulatory body over and above the ones listed in
Annexure E
Duplicate Query. Refer to Sr.No. #227
603 11 2 Eligibility Criteria
Vendor should have existing Next-Gen Data
Warehouse solution as mentioned in the EOI
Does the term 'Next-GEN DW' solution mean a solution
comprising a Data Lake & Data Warehouse? Also, does
the 'Next-GEN DW' solution refer to procurement of
hardware & software application?
Will the term 'Next-GEN DW' solution be part of the
overall contract to be signed?
Word Next Gen DW refers to solution(s) fulfilling all the requirements given
by the Bank in this EOI. Purpose of EOI is clearly mentioned as -> Please note,
the objective of this Request for EOI is to identify all possible solution (s) for
the scope of work defined in this document.
604 14 3 Data Ingestion:
GUI based framework to configure sources to NEXT-
GEN DW
Please elaborate the expectation. Does the
configuration require addition/modification:
1. Add/Delete Source System
2. Add/Delete Source System Tables, Manual Files
3. Add/Delete Metadata Reconciliation information,
etc.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
605 14 8 Data Ingestion:
Existing ETL Jobs to be Fine Tuned. Re-runnablity
checkpoints should be present in ETL jobs. New ETL
jobs should be able to parallel read and write data.
Please confirm the timeframe for the parallel run, and
whether parallel writes of existing and new ETL will be
on different servers
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
606 18 18 Data Processing Framework
Transformations for this activity can be categorized
into the following types:
· Existing transformations in DWH that needs to
be migrated to NEXT-GEN DW
· New transformations for data sources that are
not sourced by DWH
· Transformations and data processing pipelines
for real time data capture
Are there any real-time transformations happening in the
current setup? If yes, please share the number of real-time
transformations happening and also give an example
of the same.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
607 18 1 Migration from Existing Setup to Proposed
Solution:
Vendor should propose a detailed seamless
automated migration plan from existing setup to
proposed solution. Plan should focus on less
manual intervention, data reconciliation between
the systems and minimum parallel run of existing
and proposed solution.
What is the current technology set-up? How is the
reconciliation done, and what is the degree of
correctness achieved in the current system?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
608 18 3 Migration from Existing Setup to Proposed
Solution:
Data migration from existing archival solution to
new one.
For how many years does the current archival solution
hold data? How many years of data is the Bank looking
to archive in the new DWH?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
609 20 5 Data Archival and Backup :
Store backup of entire ecosystem on suitable cost-
effective, fast recovery infrastructure (Currently
tape backup is taken)
At what frequency are tape backups taken? Will the
same frequency persist for the new DWH?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
610 20 1 Cloud Integration and Migration :
NEXT-GEN DW should be able to consume data
from external cloud-based infrastructures.
Are there any restrictions on fetching data from
external clouds? If yes, how will data consumption
happen - DB connectivity/manual files? Where are the
data centers for the external clouds located?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the RBI
guidelines on the same
611 20 5 Cloud Integration and Migration :
In view of the intent to reduce the hardware
footprint (in future), the technical architecture of
NEXT-GEN DW solution should be flexible to
accommodate provisioning of NEXT-GEN DW on
cloud. The Bank understands that there can be
differences in services offered by cloud service
providers. The NEXT-GEN DW solution architecture
should be designed considering as-is infrastructure
availability in cloud.
Share details of the current cloud set-up/architecture.
This information is not required at this stage of EOI.
612 25 3 Regulatory Reporting:
Change Management –Tool for handling ongoing
change in regulation or business requirements
without the need for programming expertise. On
and average logic for 5% of jobs being changed
monthly. Data used for regulatory reporting
changes on any frequency like daily / weekly / bi-
weekly / monthly, etc
Which is the current tool used for Change
Management?
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
613 26 11 Regulatory Reporting:
Pre-submission Review – Multiple report writers
should allow users to review reports in various
formats before submission, with the ability to drill
down and make manual adjustments where
necessary.
How is the current manual adjustment done - at DB
level or in reports?
This information is not required at this stage of EOI.
614 27 1 User Management:
Vendor should propose automated solution / tool
(s) of User Access Management (UAM) for
administration of giving access to individual users
within a system access to the tools they need at
the right time.
How are users currently accessing the DWH - using only
reports, or with read access on the DB, or are they
connecting to the DWH using some third-party tool?
Please list all the tools
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
615 28 2 Data Science Platform with AI/ML Capabilities:
Power data / objects to existing analytics models
built on proprietary tools (IBM SPSS). Migration of
such models to new solution
Total No. of Models to be Migrated ? This information is not required at this stage of EOI.
616 12 and 13 End state objectives End state objectives Kindly confirm if there will be separate RFP or tender
for hardware and software (Data Warehouse, Data
Marts, Data Lake, etc) or if there will be single RFP and
bidder would be required to propose the hardware as
per their respective sizings. We are referring to the Big
Data Lake RFP released last year (March / April 2018)
where separate RFP for hardware and software was
floated
This information is not required at this stage of EOI.
617 Page 32 and
Page 34
Hardware Specifications Hardware Specifications The EOI envisages use of commodity hardware (page
32) but also mentions an uptime of 99.99% (page 34).
Please note, the overall availability would be built into
the architecture design and the SLAs confirmed based on
the overall design. Kindly confirm if this is the correct
understanding.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
618 Page 32 Hardware Specifications Not Available (we request for additional
qualification criteria)
Request SBI to restrict underlying hardware providers
to vendors with proven reference architectures for
setting up data warehouse / data mart / data lake
solutions. The reference architectures should be
published on the vendor's website. Request SBI to
further consider vendors with reference architectures
using the last two generations of Intel CPUs.
No change
619 Page 32 Hardware Specifications Not Available (we request for additional
qualification criteria)
Request SBI to restrict hardware vendors to consider
the top vendors in terms of market share (both
revenue and units). Reports published by
organizations such as Gartner and IDC could be used
to ascertain the top vendors
No change
620 15 3.Data Storage A multi-temperature data management solution to
be proposed by vendor, where frequently accessed
data resides on fast storage (hot data), less
frequently accessed data on slightly slower storage
(warm data), and rarely accessed data on the
slowest storage (cold data). The system should also
be capable of automated storage tiering and
seamless data transfer between hot, warm and cold
storage. Data residing in any of these storage areas
must be seamlessly mixed / merged according to
requirements without impacting performance.
Request Bank to not make this three-tier clause
mandatory, as an OEM may also prefer to offer an all-flash
solution or a two-tier architecture based on the
solution performance requirement. Also, please
advise on the following 7 points:
1. What are the possible applications or data analytics
engines the customer is considering for the AI/ML data
warehouse project? (to check if we have ISV
partnerships with the AI/ML stacks to be considered)
Would this be real-time analytics, cold or hot analytics,
or a combination of both?
2. What is the approximate data size for migration
from the old set-up?
3. What are the applications and protocols involved in
the data that needs to be migrated?
4. What would be the average retention of data on the
active tier?
5. What protocols need to be considered for the data
warehouse project?
6. What OS flavour servers would be involved for data
processing?
7. Is the customer open to the ready AI stacks we have in
our offerings? (AI-ready stack, an end-to-end ready
solution for AI/ML use cases with networking,
compute, storage and racks in partnership with NVIDIA)
No change in standard clause of EOI
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
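The hot/warm/cold tiering described in the clause above amounts to a recency-based placement policy. A minimal illustrative sketch follows; the thresholds and function names are hypothetical assumptions, not figures from the EOI, and any real solution would take its cut-offs from the Bank's data-temperature policy.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds for illustration only (not from the EOI).
HOT_DAYS = 30    # accessed within the last 30 days -> hot (fast storage)
WARM_DAYS = 365  # accessed within the last year    -> warm (slower storage)

def storage_tier(last_accessed: datetime, now: datetime) -> str:
    """Classify a dataset into hot/warm/cold storage by access recency."""
    age = now - last_accessed
    if age <= timedelta(days=HOT_DAYS):
        return "hot"
    if age <= timedelta(days=WARM_DAYS):
        return "warm"
    return "cold"
```

Automated tiering would periodically re-evaluate this classification and move data whose tier has changed, which is what the clause's "seamless data transfer between hot, warm and cold storage" implies.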
621 20 1.Data Archival and Backup Data older than specific duration as identified by
Bank to be archived in low cost cold storage.
Changing data archival rules should be easily
configurable. Vendor to propose solution for the
same with cheap and flexible storage and
processing
Recommend the Bank to consider an on-premise, cloud-based
archive solution instead of the traditional tape-based
method, to bring down TCO and adopt modern
technology concepts.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
622 39 Annexure D General Please provide the existing EDW technical
architecture along with the tools used (ETL, DQ, DW,
BI and their versions), which will help us to integrate it
with the proposed Data Lake. Also provide the list of
downstream systems being used in the existing DWH.
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
623 12 Annexure B Migration from existing setup to proposed solution Does the bank intend to use the existing DW for
some time post the migration activities, or will it be a
complete sunset after migration?
Bank will take a final call at appropriate time.
624 12 Annexure B Framework for regulatory reporting Does this include implementation of ADF (Automated
Data Flow).
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
625 13 Annexure B Does the bank have any tentative project timeline in
mind?
Participating Bidders are expected to propose high level timelines for this
project as clearly mentioned on page number # 13 of this EOI.
626 13 Annexure B Is the bank open to using completely open-source
technologies and commodity hardware?
Please refer to EOI clauses for more details
627 13 Annexure B Performance Benchmarks Will there be any 3rd party involved to conduct the
performance benchmarks? Does the bank have any
performance parameters listed?
Bank may take a call to involve a 3rd party to conduct/test performance
benchmark in future.
Bidder to provide details of performance benchmarking to enable us to take a
holistic and comprehensive view of the architecture in formulating next
course of action
628 39 Annexure D Apart from Reports/ETL, are there any existing
statistical models to be migrated into the new setup?
Please refer to EOI
629 20 Annexure B Cloud Integration Does the bank want the entire Next-Gen setup in a
public/private cloud (if required), or just cloud
integration (e.g. to process external data and bring
only processed data to its Next-Gen DW for analysis)?
At present, as per the Bank's IS policy migrating/storing data in public cloud is
not permitted. However bidders may propose, as an alternative, use of cloud
(public cloud, private cloud, on-premise etc) in addition to the best integrated
proposed solution.
630 20 Annexure B Transfer out of NEXT-GEN DW to public cloud
should not be possible by all roles in NEXT-GEN
DW. All activities with data transfer from Public
cloud should be logged for audit and monitoring.
It is interpreted that data will only flow INTO the Next
gen DW from the public clouds and never move out of
the Next gen DW to any of the public sites. Is this
understanding correct?
At present, as per the Bank's IS policy migrating/storing data in public cloud is
not permitted. However bidders may propose, as an alternative, use of cloud
(public cloud, private cloud, on-premise etc) in addition to the best integrated
proposed solution.
631 20 Annexure B General How many Analytics users / Data Scientists are likely
to access the system? What is the concurrency?
Please refer Hardware Specification subsection, point number #34 on page
#35
632 39 Annexure D ETL Please share details of the number of ETL jobs to be
migrated into the Next-Gen DW, by complexity, where:
1. Very Simple = 4 Transformations
2. Simple = 6 Transformations
3. Medium = 10 Transformations
4. Complex = 15 Transformations
5. Very Complex > 15 Transformations
This information is not required at this stage of EOI.
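The complexity bands proposed in the query above can be read as cumulative ranges. A minimal sketch of that classification follows; reading the "=" figures as upper bounds is our interpretation of the bidder's banding, not something stated in the EOI.

```python
def etl_complexity(transformations: int) -> str:
    """Map an ETL job to a complexity band by its transformation count.

    Bands are taken from the bidder's query, with '= N' read as 'up to N':
    <=4 Very Simple, <=6 Simple, <=10 Medium, <=15 Complex, >15 Very Complex.
    """
    if transformations <= 4:
        return "Very Simple"
    if transformations <= 6:
        return "Simple"
    if transformations <= 10:
        return "Medium"
    if transformations <= 15:
        return "Complex"
    return "Very Complex"
```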
633 20 Annexure B General Is there any requirement for post-go-live support? If
yes, then please specify the duration of the support
period.
Required information is clearly given in EOI on page number # 33, point
number 12.
634 20 Annexure B General For 'in-memory processing', what will be the use cases,
volume and duration of data to be considered?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
635 13 Annexure B Performance Benchmarks Will the benchmarking be done with data encryption
and masking features?
Yes.
Bidder to provide details of performance benchmarking to enable us to take a
holistic and comprehensive view of the architecture in formulating next
course of action
636 20 Annexure B Data Masking Please explain what level of masking is expected.
Is data expected to be masked in Production and
stored in masked form? Or is data in non-prod also
expected to be masked?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
637 13 Network
specification
Please clarify what is meant by internal and external
specification.
Internal network specification means network requirements within the
different components of Next-Gen DW, and external network specification
means network requirements between Next-Gen DW and other applications in the Bank.
638 34 15 The Vendor shall ensure all Installations &
Implementation to be done by OEM badged
resources only
Request the Bank to modify this to OEM/Bidder
resources.
No change in standard clause of EOI
639 15 Architecture diagram Deployment plan - Vendor to
submit architecture diagram of entire setup with
network and security equipment required. Bank
may change it after vetting by Information Security
Dept and / or Enterprise architecture Dept. It will
be binding on vendor.
Will this setup be completely greenfield? Is the bidder
expected to supply the network and security
equipment?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
640 34 18 The proposed hardware is mission critical for the
proposed project and support of 24 X 7 with an
uptime of 99.99 % to be ensured by providing
support at PR, and DR site for a period of 5 years.
Is bidder expected to provide onsite resources for
every domain? Please clarify.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
641 35 22 The Hardware solution must be compatible to
integrate with various systems in the Bank
including but not limited to SOC, PIMS, NOC,
Command Centre, ITAM, Service Desk, ADS, and
SSO etc. at no extra cost. Vendor will have to give
appropriate support to the Bank during integration
with various components of IT environment.
Please clarify what integration is expected. The solution proposed by the Bidder should be able to work with the existing tools
mentioned in the clause.
642 24 Technical Criteria/Scope of Work Data protection, Data security, Data privacy Whether the client is expecting PII and SPI to be
masked and encrypted in the proposed DWH
environment. Whether the bidder can leverage the client's
existing masking and encryption mechanisms.
Yes, Bidders are expected to propose solution(s) for encryption and masking
of PII and SPI data.
643 24 Technical Criteria/Scope of Work Authentication and Identity Management - A
comprehensive identity and access management
system should be available for centralized
management of users and groups. It should be
possible to quickly create and revoke the identity of
a user or a service by simply deleting or disabling
the account in the directory. Multi-factor
authentication is desired as an additional layer of
security for user sign-in and transactions
Whether the client is expecting the Data Warehouse solution
to be integrated with an IDAM solution. Please
provide the user count, or confirm whether the bidder can leverage the
client's existing IDAM solution to be integrated with the
DWH.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
644 25 Technical Criteria/Scope of Work Compliance to Global Standards - GDPR, BCBS239,
PCIDSS, DFRA and similar relevant standards
So whether the client's PII and SPI information is going to
be transferred from European territory to a centralized DWH
site in India. As per the GDPR standards,
an organization should document the personal data it
holds, where it came from and with whom it is shared.
Yes, proposed solution(s) should be GDPR compliant.
645 25 Technical Criteria/Scope of Work Data Leakage - Security CIA parameters should be
achieved, and tools should be able to find and alert
on Data leakage
Whether the client is expecting data leak prevention in the
proposed Data Warehouse setup. Whether the bidder can
leverage the client's existing DLP solution for securing
data at rest and in motion.
Yes, Bidders are expected to propose solution(s) for DLP
646 25 Technical Criteria/Scope of Work Vendor should propose automated solution / tool
(s) of User Access Management (UAM) for
administration of giving access to individual users
within a system access to the tools they need at
the right time
Our understanding is that the proposed solution must be
integrated with a PAM solution. Whether the bidder can
leverage the client's existing PAM solution.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
647 General Please provide the additional security controls that the
bidder needs to propose.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
648 33 Hardware Specifications General What is the level of detail required in the Hardware
Specifications? Please specify or share any format that
the bank may have.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
649 19 Disaster Recovery The DR solution should be synced with production
NEXT-GEN DW. The SLA for RTO should be
maximum 2Hrs as per Bank’s defined policy.
Are there any requirements for RPO? Bidders to provide their best RPO for the solution(s) proposed.
650 33 Hardware Specifications Vendor must ensure that the proposed servers are
fault-resilient with the most comprehensive
features and functionalities to ensure maximum
system uptime.
Can you please specify the details for ACTIVE-PASSIVE
requirements for DC/DR
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
651 36 Hardware Specifications The Vendor also needs to provide the configuration
for setting up of functional DR along with DEV and
UAT for each and every component / application of
Next Gen DW ecosystem.
Can you please specify the exact scope/coverage of
"Functional DR"
Please refer to subsection Disaster Recovery on page no #19 of EOI document
652 51 Hardware Specifications Vendor to submit all back-to-back agreement
copies between Vendor and SI / OEM / Parent
company etc if any and tenure of the back-to-back
agreement should be same as selected Vendor’s
agreement with the Bank
Would request the bank to keep this requirement at
the RFP stage.
Please refer to Corrigendum.
653 11 Annexure A 2. Vendor should have existing Next-Gen Data
Warehouse solution as mentioned in the EOI
3. The solution should have been implemented in
at least 2 large scale organizations.
The Annexure B reference is too open; please help with a
refined proposition. Sample criteria below, e.g.:
• Implementation of Data warehouse in Government/
Banking/ Telecom including the below criteria:
- Integration of Source Systems
- Integration of Reference Dimension Data
- Reports / Downstream Data reports
- Creation of Daily/Monthly Aggregates
- Historical Data Migration
- Near Real Time Platform including Data Integration &
Hadoop.
- 5+ Petabyte of data warehouse
- Near Real Time Data Availability Platform
- Daily data ingest of 5+ TB
- 100 + Hadoop Nodes for processing the data
- The approximate value of the project should be 20 +
Cr (INR)
Please refer to Corrigendum.
654 12 Annexure B EOI Proposal should include following items for
each proposed option
The Following Specifications for each of PROD, DEV,
UAT and DR environments;
In addition, Performance Testing and Pre-Prod
environments should also be included.
No change in the requirements of EOI
655 13 Annexure B - Critical Functional
Requirements
2 - Data may be structured, semi-structured, and
unstructured. It may come from internal or external
sources. It may come in batches, incremental
additions or real-time feeds. There should be no
limitation on the type, format and size of data
ingested. Data may include log, feeds, audio, video,
image, NOSQL, RDBMS, unstructured text, through
ERP systems, etc
Unstructured data should have a use case identified; if
this is for storage only, the solution may be different
from one for processing the unstructured data.
Please refer to Annexure E for sample use cases
656 14 Annexure B - Critical Functional
Requirements
3 - GUI based framework to configure sources to
NEXT-GEN DW
Can you please elaborate on this requirement? Refer to Sr. No. #604
657 14 Annexure B - Critical Functional
Requirements
8 - Existing ETL jobs to be fine-tuned. Re-
runnability checkpoints should be present in ETL
jobs. New ETL jobs should be able to read and write
data in parallel.
What is the current ETL tool, with version, for
fine-tuning? The approach will be suggested accordingly.
Currently we are using IBM Stack. Bidders are free to propose solution(s) in
the best interest of the Bank to meet the requirements given in the EOI.
658 14 Annexure B - Critical Functional
Requirements
Tools used for Data Ingestion should be platform
and database independent and should be
compatible to ingest and replicate data on parallel
processing
Will open source stack be allowed? Please refer to EOI for details
659 14 Annexure B - Critical Functional
Requirements
End objective for the data ingestion is to publish
the dashboards for end users or any job related to
reporting and analytics max by 8.00am on next
business day
What is the time of close-of-business processing?
The objective is to determine the window the SI gets for
report availability at 8 am.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
660 14 Annexure B - Critical Functional
Requirements
Trigger mechanisms in identifying any structural
changes at source
Will Source systems allow access to such logs? This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
661 14 Annexure B - Critical Functional
Requirements
The vendor should be able to design solutions to
handle data volumes and complexity in source data
with decompression logic wherever required.
The source system should allow the de-compression process
to run on the same server, to ensure service compatibility.
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
662 16 Annexure B - Critical Functional
Requirements
11- The storage system should be robust to handle
at least 1,50,000 concurrent queries (Select/DML)
by processing engines / ETL jobs / end users
scalable up to 6,00,000 concurrent queries in next 5
years (assuming parallelism of 100 degree).
What is the sizing of the existing infrastructure and
data storage platform?
Refer Annexure G for sizing. Currently we are using IBM Stack. Bidders are
free to propose solution(s) in the best interest of the Bank to meet the
requirements given in the EOI.
663 16 Annexure B - Critical Functional
Requirements
11- The storage system should be robust to handle
at least 1,50,000 concurrent queries (Select/DML)
by processing engines / ETL jobs / end users
scalable up to 6,00,000 concurrent queries in next 5
years (assuming parallelism of 100 degree).
What is the distribution of users for:
1) Within intranet - application/ report users
2) From internet
3) Sandbox
This information is not required at this stage of EOI.
664 16 Annexure B - Critical Functional
Requirements
12 - Downstream departments (data consumer) to
be given separate processing power, storage to
undertake their requirements with separate DB
snapshot, Audit trails should be available for any
user accessing the Databases. Construction of this
separate Database snapshot and enabling this audit
trails must not cause any major systemic
issues/challenges in smooth functioning of primary
DB.
Is SBI looking for a separate DB snapshot, with a
separate queue as well, for the specific users within the
warehouse?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
665 16 Annexure B - Critical Functional
Requirements
13 - It should be possible to project and view data
through multiple modes using the Storage on NEXT-
GEN DW. Varieties of GUIs should be available to
project or view the output generated through
analytic processes. For instance: The Bank may
decide to implement use-cases that project
transactions data as a graph data structure. The
Storage solution on NEXT-GEN DW should allow for
such projections.
A graph data structure can be provided in an RDBMS as well
as in other structures like a graph DB, which stores
data in XML/JSON. Does SBI have any specific thoughts
around the same?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
666 17 Annexure B - Critical Functional
Requirements
Framework should have mechanism to protect data
at rest and at motion from unauthorized user
access and amendments.
Will the solution need its own access provisioning
framework, or can it integrate with the Bank's existing
authentication framework?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
667 17 Annexure B - Critical Functional
Requirements
Data Processing Framework This has overlaps with Data Ingestion. Is there any specific
need to have this separately?
Data ingestion belongs to data sourcing requirements and Data Processing
framework covers scope of internal data processing in new set up and data
extraction for downstream departments & users.
668 17 Annexure B - Critical Functional
Requirements
Migration from Existing Setup to Proposed Solution 1) Please list the sources of structured/
unstructured/semi-structured data and their volumetrics
and growth. Also, if there are external sources, please list them.
2) Are any new data sources anticipated in the near
future? For the current implementation, do we need to
estimate any future sources? Please confirm.
3) Is there any requirement for data marts to be
created? If so, please provide the functional area with
the number of data marts required.
4) For the ODS/data warehouse process, please
provide the following source details w.r.t. structured
data:
a. List/number/types of databases
b. List/number of physical tables
c. List of source files for integration
d. One-time/daily load volumetrics and data growth
e. Batch processing latency, etc.
5) The SI assumes that all the source system applications
will be able to provide the historical data from sources
in readable format
to be loaded into the DWH as files; please confirm (for
structured data).
6) If the previous assumption is incorrect then:
1. Sufficient information required at this stage for sources, sizing is given in
EOI
2. Please refer to Annexure C for Sizing of Data for more details
3. This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
4. This information is not required at this stage of EOI.
5. Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
6. Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
7. Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
8. Please refer to sub section Data Ingestion on page number # 13 for details.
9. This information is not required at this stage of EOI.
10. Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI. Bank will take a final call at
appropriate time.
11. Refer to Annexure C. Bidders are free to propose solution(s) in the best
interest of the Bank to meet the requirements given in the EOI.
12. This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
13. Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
14) Refer to Annexure C, D in EOI
15) This information is not required at this stage of EOI.
16) This information is not required at this stage of EOI. Bidders are free to
669 18 Annexure B - Critical Functional
Requirements
Data Federation/Virtualization -3
Semantic integration of structured & unstructured
Data.
Kindly help us with the Use case for the unstructured
Data within Federation layer, this will help us in
putting the pointed solution
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
670 18 Annexure B - Critical Functional
Requirements
All data should be made discoverable and
integrable easily through a single virtual layer
which will expose redundancy and quality issues
faster.
Are we referring to only reportable data or data from
all layers of warehouse?
Data from all the layers of Next-Gen DW and all source systems
671 18 Annexure B - Critical Functional
Requirements
Migration from Existing Setup to Proposed
Solution, 2 - Data migration from Staging and Data
Marts, user tables and any other schemas identified
by Bank.
Do we have the detailed data? Migration will have
challenges in case the detailed data is not available.
A detailed current landscape picture, with platforms and
applications, should be provided for an accurate
solution.
Currently we are using IBM Stack. Bidders are expected to propose high level
migration plan.
672 19 Annexure B - Critical Functional
Requirements
8 - Migration of Data Governance, Data Lineage and
Data Quality rules and policies
What is the status of the current data lineage? This
is a critical and important area to know for migration.
Does SBI have accurate and detailed documentation of the
current implementation?
Currently we are using IBM Stack. Bidders are expected to propose high level
migration plan.
673 20 Annexure B - Critical Functional
Requirements
All the applications connected to the non-archived
data should be available with the archived data as well.
Please provide use cases for this requirement, to
narrow down the type of data access required from
archival data.
Access to archival solution is expected to be similar to production setup
674 21 Annexure B - Critical Functional
Requirements
10 - Threshold control to kill the high resource
consumption query
This should not be part of the Monitoring Dashboard. The monitoring dashboard should showcase high-resource queries killed due to
crossing threshold limits.
675 21 Annexure B - Critical Functional
Requirements
24 - Data Reconciliation status for every data
movement on real time basis
Data reconciliation should be real-time for specific
cases, not all.
No change in standard clause of EOI. Bidders are free to propose solution(s)
in the best interest of the Bank to meet the requirements given in the EOI.
676 23 Annexure B - Critical Functional
Requirements
10 - Identity resolution - Identity resolution is the
process of linking various records and is the main
engine for record de-duplication, which can enable
some aspects of data cleansing.
Is SBI looking for an identity resolution system or a
full-fledged MDM?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
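Identity resolution as described in the clause above links records that refer to the same entity, which then drives de-duplication. A minimal blocking-key sketch follows; the field names (name, PAN) are hypothetical, chosen only because customer-master profiling of PAN appears elsewhere in the EOI, and a production system would use far richer matching rules.

```python
import re

def match_key(name: str, pan: str) -> tuple:
    """Illustrative blocking key: normalised name plus PAN.

    Records sharing the same key are treated as candidate duplicates.
    """
    norm = re.sub(r"\s+", " ", name.strip().lower())
    return (norm, pan.strip().upper())

def find_duplicates(records):
    """Group record ids by match key; any group with more than one
    record is a candidate duplicate set for de-duplication review."""
    groups = {}
    for rec in records:
        key = match_key(rec["name"], rec["pan"])
        groups.setdefault(key, []).append(rec["id"])
    return [ids for ids in groups.values() if len(ids) > 1]
```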
677 23 Data Quality 1)Provide the attributes count for each of the source
system that needs DQ process
2)Approx. how many master attributes are considered
for data quality processes?
3)Records from how many sources and what source
systems needs to go through the data quality
processes?
4)Any process used at present to assess the
authenticity of records after the DQ process is run?
Who are the people (their profiles - data stewards
etc.) involved in the same?
5)Please indicate, Volume of records on which the
current Data Quality processes are carried out? What
is the % of record increments per week/month?
6)Any data enrichment processes being used at
present? If yes, what are they?
7)Please specify the list of Country/Countries for
which we require address verification and
standardization. Please list the country (ies) of origin
for the name and address data to be processed.
8)Is data quality going to be a one-time data cleansing
effort, or are we required to do this periodically?
9)Is there a requirement for a real-time/near real-
time or post facto data cleansing?
10)Please indicate type of data enrichment sources
that should be used, whether any other third-party
tool to be used.
11)Are there serious data gaps in today's scenario that
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
678 24 Annexure B - Critical Functional
Requirements
15 - Each month 5 data quality use cases will be
developed and implemented on Next Gen DW.
Examples of use cases are as given below;
- Profiling of Customer Master table for verifying
PAN, Mobile Number, Date of Birth, Address, Pin
Code, etc
- Profiling of Branch Master for verifying branch
address, contact information, branch manager/staff
information, etc
Are we looking for validation and enrichment from
open government APIs like MCA21, UIDAI, NSDL?
Do we have agreements with the government
agencies?
Also, what is the suggestion on paid data sources?
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
679 27 Annexure B - Critical Functional
Requirements
Data Masking What is the plan for data masking: is this dynamic
data masking or persistent data masking for the data
ported to non-prod environments?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
680 28 Annexure B - Critical Functional
Requirements
8 -Real time reporting can be done through Staging
area of Data Warehouse
Do we want to have all the data in staging, as is done
in the as-is setup, or can the bidder suggest other options?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
681 28 Annexure B - Critical Functional
Requirements
Implementing end to end analytics use-cases as
mandated by the Bank
Have any AI/ML use cases already been implemented?
How many such cases?
This information is not required at this stage of EOI.
682 29 Annexure B - Critical Functional
Requirements
Data Science Platform with AI/ML Capabilities
Vendor to provide solution / tool (s) for below
scope of activities on SBI data sets;
- Benchmarking
- Predictive & Prescriptive Analytics
- Social Media Analytics
- Web Analytics
- Geolocation Analysis
- Ad-Hoc Analysis
- Trend Indicators
- Profit Analysis
- In-Memory Analysis
- Statistic Analytics
- Data Mining
Is there any quantification of the number of models that need
to be created as part of the project?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
683 30 Business Intelligence Tools 1)Please provide the source system details for BI &
Analytics and list the formats of data.
2)Please share the volumetric details of source
systems
3)Please share details of existing data quality and
consistency across data sources
4)Do you envisage any need for unstructured data in
future,
e.g. Text, XML files, Audit logs, files, pictures, social
media?
5)Please provide the number of users to be
provisioned for analytics and business intelligence
tools users from a Licensing standpoint and YOY
growth expected on the same.
6)While proposing licenses, should we go for perpetual
licenses, which would help beyond implementation
and support, or subscription-based licenses?
7)Please provide Data Retention policies in the Target
System. Please specify the time period for the storage
of Online Data as well as Archival Data
8)How many years of data are considered for history load
and data analysis?
9)Please let us know the Number of Reports and
Dashboards expected with complexity level (i.e.
Simple, Medium & Complex)?
10)How many metadata layers are required for
reporting?
11)Is Multilingual reporting expected?
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
684 31 Annexure B - Critical Functional
Requirements
Mobile version: BI tools should be able to
differentiate between viewing BI applications on a
web browser on a mobile device versus a mobile BI
application.
Does the Bank need reports to be integrated with its
current mobile platform and application, or can standard
mobility apps of BI OEMs be used?
Yes
685 36 Annexure B - Critical Functional
Requirements
36 - NEXT-GEN DW is expected to have more users
and the solution should not be bound by any
license model for number of users
Not all products support a core-based licensing model; please suggest. This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
686 26 Annexure B - Critical Functional
Requirements
High Volume, High Performance and Reliability –
Scalable and resilient architecture which will handle
all volume and performance demands.
Do regulatory requirements need archival data access
as well?
Yes. It should have the capability to fetch data from Archival solution
687 39 Annexure D Existing Data Warehouse Architecture What is the current number of users at the
BI/ETL/Database level? What is the user concurrency at
BI/ETL/Database? Also, what is the type/make/core
information of the ETL/DWH/ODS/DM servers?
Duplicate Query. Refer to Sr.No. #280
688 39 Annexure D Existing Data Warehouse Architecture Existing backup details are not provided; please
provide backup and other configuration details:
whether existing backups are disk-based or tape-based,
and the frequency of backups.
Duplicate Query. Refer to Sr.No. #281
689 19 Disaster Recovery
clause 1
Bank proposes to setup only functional DR to start
with. At later stage Bank may take decision to
setup full scale 100% DR.
Please elaborate on functional DR in terms of PROD
capacity. Is DR being considered from Day 1?
Duplicate Query. Refer to Sr.No. #282
690 19 DR Clause 4 The DR solution should be synced with production
NEXT-GEN DW. The SLA for RTO should be
maximum 2Hrs as per Bank’s defined policy.
Since the volumes involved are large, the bandwidth
capacity and the time slots for replication will be
provided by the Bank. Please clarify.
Duplicate Query. Refer to Sr.No. #283
691 19 DR Clause 6 The proposed solution is expected to have a
monitoring engine that can determine the health of
production NEXT-GEN DW and raise alerts / trigger
remedial actions to bring NEXT-GEN DW – DR as
the default NEXT-GEN DW
Bank wishes to have an automation tool for the same.
Please clarify.
Duplicate Query. Refer to Sr.No. #284
692 32 HW Specs clause 10 Vendor must provide detailed configuration of the
proposed Hardware, including Hosting Space
Requirements, Racks, Power, Cooling and any other
requirement for the fulfillment of the Vendor’s
obligation in this EOI.
For exact sizing, various inputs/interactions will be
required. For the EOI, an approximate indication should
suffice. Please confirm.
Duplicate Query. Refer to Sr.No. #285
693 35 HW Specs Clause 30 Vendor is required to provide the minimum
resources to monitor & manage the infrastructure,
however it is the Vendor’s responsibility to right
size the resources to meet the SLA
Clarification needed on Bank's expectations on
number of resources. Can you please explicitly
mention the SLA requirements.
Duplicate Query. Refer to Sr.No. #286
694 36 HW Specs Clause 44 Vendor need to propose a solution for data
migration / transfer between Existing DWH (Navi
Mumbai Location 1) and NEXT-GEN DW-PR (Navi
Mumbai Location) and also between NEXT-GEN DW-
PR (Navi Mumbai Location 2) and Hyderabad (DR)
or any other places for PR and DR decided by the
Bank.
Please share details of locations, approx. distances,
and bandwidth capacity to be provided by the Bank. Please
clarify.
Duplicate Query. Refer to Sr.No. #287
695 36 HW Specs Clause 48 The vendor should provide EXACT size needed for
production in the 1st year and estimated sizes for
consecutive years keeping in view the growth rate
predicted by Bank in this section and provide
empirical evidence for the calculation of growth
rate.
For exact sizing, various inputs/interactions will be
required. For the EOI, an approximate indication should
suffice. Please confirm.
Duplicate Query. Refer to Sr.No. #288
696 39 Annexure D Existing Data Warehouse Architecture Please share : 1. Detailed architecture diagram of
existing EDW & 2. Breakup of tiers among the
80+ production servers.
Duplicate Query. Refer to Sr.No. #289
697 46 Annexure G Next Gen Data Warehouse Sizing DR to be sized only for DWH . Please confirm Duplicate Query. Refer to Sr.No. #290
698 40 Annexure E Functional Use Cases Is the Bank looking to leverage the existing OFSAA
implementation for these use cases? Alternatively, will
OFSAA be fed from the DWH? Please let us know the
interplay between the DWH and OFSAA.
Duplicate Query. Refer to Sr.No. #291
699 40 Annexure E Functional Use Cases - does the bank see the Next-Gen-DW as a
complement to existing data solutions at the
bank (e.g. OFSAA) ?
Duplicate Query. Refer to Sr.No. #292
700 40 Annexure E Functional Use Cases - role of Next-Gen-DW in mission critical
operational processes, e.g. daily RBI and regulatory
reporting, real-time financial crime detection,
etc.
Duplicate Query. Refer to Sr.No. #293
701 12 Annexure B End State Objective : Migration from existing setup Please provide complete details of the existing DWH
Solution. Information including but not limited to:
DWH platform, model, version and Size
Details of CPUs, Memory, OS, Database version
Details of Storage configuration: Size, capacity, free
and used space etc
Whether there is any physical or logical isolation of
the DWH setup
HA, Backup, DR details
Currently we are using the IBM stack. Bidders are expected to propose a
high-level migration plan.
702 16 4 Storage replication (e.g. RAID) should be
automatically managed by the platform.
We believe this refers to RAID levels; if not, please
elaborate further.
Duplicate Query. Refer to Sr.No. #296
703 16 8 Storage should support data compression. It should
be possible to perform both fast compression and
efficient compression based on data processing
needs.
We recommend that the Bank additionally ask for a storage
and DWH system capable of providing columnar-based
as well as row-based compression, and that data should be
readable without decompression.
Duplicate Query. Refer to Sr.No. #297
704 16 9 The storage should be horizontally and vertically
scalable. Redistribution of data across the NEXT-
GEN DW should be possible automatically and
seamlessly.
We recommend that the Bank additionally ask that
"Storage upgrades and capacity increases should be done
without downtime".
Duplicate Query. Refer to Sr.No. #298
705 16 11 The storage system should be robust to handle at
least 1,50,000 concurrent queries (Select/DML) by
processing engines / ETL jobs / end users scalable
up to 6,00,000 concurrent queries in next 5 years
(assuming parallelism of 100 degree).
1. Please provide details of queries and ETL jobs
2. Please provide ratio of Select/DML queries and ETL
jobs
3. Share the complexity of query- simple, medium,
complex Queries
4. Also provide the definition of Simple, Medium
and Complex queries
Duplicate Query. Refer to Sr.No. #299
706 16 12 Downstream departments (data consumer) to be
given separate processing power, storage to
undertake their requirements with separate DB
snapshot, Audit trails should be available for any
user accessing the Databases. Construction of this
separate Database snapshot and enabling this audit
trails must not cause any major systemic
issues/challenges in smooth functioning of primary
DB.
Please provide the rationale for having separate
processing and storage for this. Powerful DWH
systems today are capable of servicing all consumer
groups in parallel.
Duplicate Query. Refer to Sr.No. #300
707 17 12 ETL/ELT tool for data extraction should have AI/ML
features for suggesting / improving Query / ETL /
ELT Stages
Please clarify further with example Duplicate Query. Refer to Sr.No. #301
708 17 13 Existing reports and extracts generation jobs on
DWH should be analyzed and transformed to the
NEXT-GEN DW. The vendor should use preferably
off-the- shelf tools and not resort to building from
scratch.
Please share sample reports and extracts to propose
most suitable options for migration
Duplicate Query. Refer to Sr.No. #302
709 17 Data transformations should be triggered in
parallel. The NEXT-GEN DW should be capable to
run multiple transformation jobs in parallel. The
NEXT-GEN DW should be able to run at-least 1500
jobs in parallel, scalable up to 5000 in next 5 years,
of varying complexity - simple, medium, complex, in
batch or near real
Please define complexity: simple, medium, complex.
Share the split % of simple, medium and complex.
Please share sample jobs for each type.
Duplicate Query. Refer to Sr.No. #303
710 27 User Management: Pt4-The access privileges
associated with each system product, e.g.
operating system, network, database, application
and system utilities, and the users to which these
privileges need to be allocated should be clearly
identified and documented.
Should we assume that the access privileges are to be
assigned to the users directly and managing access to
these privileged accounts is not required?
Duplicate Query. Refer to Sr.No. #304
711 39 User database of 30000+ officials Should we assume approx 30k users would access the
solution with a YoY increase of 10%?
Duplicate Query. Refer to Sr.No. #305
712 30 6 Reporting on all types of available of Data Formats;
· Structured, semi-structured, unstructured
· Click stream data
· Audit Logs
· Documents
· Multimedia data (Images/Videos/Audios)
· XBRL format
· IRIS iFILE framework
Please clarify the type of database used for each of
the data formats asked for. Or is it safe to assume
the underlying data will be in an industry-standard
relational data format like Oracle, DB2, etc.?
Duplicate Query. Refer to Sr.No. #306
713 30 5 Visualizations: BI tools must provide below
different types of visualizations;
· Animations, Barcodes
· Bar, line, pie, area and radar chart types
· Tables, Graphs, Infographics, Filters
· Widgets
· Drag and Drop Creation, Customization
· Templates
· Freehand SQL Command
· Geospatial Integration
· Layouts
· Themes
· Ability to mix and match various combinations
Please elaborate on the definition of
-Animations
-Infographic :what all visualization are you referring to
-Widgets, Templates
Duplicate Query. Refer to Sr.No. #307
714 32 22 In-memory analytics: The product should pull data
into an in-memory or locally cached data store
preferably columnar is an increasingly popular
feature that enables very fast analytics once the
data is loaded.
To achieve fast analytics, BI tools may adopt
different architectures. A BI tool can easily leverage the
in-memory benefits of the underlying database without
pulling data and creating redundancy, thereby
reducing the data manageability burden at the BI layer. Request
you to rephrase this point as:
"In-memory analytics: The product should pull data
into an in-memory store, leverage the in-memory
capabilities of the underlying database, or use a locally cached
data store, preferably columnar, an increasingly
popular feature that enables very fast analytics once
the data is loaded."
Duplicate Query. Refer to Sr.No. #308
715 32 23 Offline updates: BI tools, when storing copies of
the source data in an online analytical processing
(OLAP) cube or in-memory columnar data store,
should enable business users to schedule
automatic data updates.
Different BI tools have different architectures. A BI tool
can easily leverage the capabilities of the underlying
database. Is this point relevant only to BI tools whose
architecture stores the data within the BI server?
Duplicate Query. Refer to Sr.No. #309
716 32 28 Speed of access: Query performance will vary based
on the complexity of the queries and the amount of
data involved. Dashboards with multiple
visualizations will need to get query results from
many queries. The best practice is to create several
prebuilt query scenarios and compare how each
product performs based on these specific
examples. The worst practice is to just arbitrarily
rate the speed.
These seem to be best practices for implementation.
Please elaborate on what is required from the BI
tool.
Duplicate Query. Refer to Sr.No. #310
717 32 29 The best practice is to establish a testing
environment to determine scalability in terms of
both the number of concurrent users and data
metrics, such as volumes, variety and veracity.
These seem to be best practices for implementation.
Please elaborate on what is required from the BI
tool.
Duplicate Query. Refer to Sr.No. #311
718 32 32 Ability to handle and summarize huge volumes of
data. E.g. 30-40 million rows accessed on index and
summarized over 5 to 8 metrics.
Please elaborate the use case for consumption of 30-
40 million rows from BI. Usually the BI tool leverages the
underlying database to do summarization of data and
only works on the resulting dataset.
Duplicate Query. Refer to Sr.No. #312
719 35 36 The web portal of Business Intelligence tool should
support at-least 25000
concurrent users, scalable up to 75000 in next 5
years, accessing various reports generated
For doing the sizing of Business Intelligence we need
bifurcation of the concurrent users (25000)
Total Concurrent Users : 25000
Number of concurrent active : provide the concurrent
active user count
Number of logged/in-active : provide the
loggedin/active user count
Out of Active Users:
- Users executing BIEE dashboards (having 4/5 reports
or simple charts in a dashboard)
- Users executing large Pivot table operations (25000+
rows)
- Users executing (small to medium sized report - 50K
cells or lower) export to pdf/XL operations
- Users executing very heavy Graphics
Number of Active Concurrent running Extra Large
Reports
(Usually Extra Large Reports are executed off-line
hours)
Duplicate Query. Refer to Sr.No. #313
720 15 point # 16 Proposed solution should be able to scrape
encrypted logs, capture Metadata
changes at source level completely, scraping 4000-
5000 logs daily having log
size of ~ 2 TB each, scalable up to 10000 logs.
Proposed solution should be
capable of scraping logs generated by any type of
Database. E.g. Oracle
Database, IBM DB2 Database etc.
Kindly specify if decryption of logs is also required, or
whether only storage of such logs is fine. If yes, is there
decryption logic available in the specified system?
Duplicate Query. Refer to Sr.No. #314
721 38 Annexure C -Monthly Data processed
in DWH Warehouse
Archived log extract CBS (SBI) +
TF (SBI)
Are these logs encrypted?
Do we need to keep the RAW logs in the system, or
only processed logs?
Duplicate Query. Refer to Sr.No. #315
722 29 # 9 (Data Science Platform with
AI/ML Capabilities)
GPUs to be incorporated in solution if possible
using HDFS Hadoop like
environment for better analytical results
1. Is there a requirement to run AI/ML models within
HDFS/Hadoop? Or is the expectation to pull the data into a
GPU-based analytics workbench and then process it?
2. Running AI/ML models within Hadoop is also faster,
and having a separate GPU-based system for specific AI
models can reduce the cost of the GPU-based solution.
Please suggest.
Duplicate Query. Refer to Sr.No. #316
723 15 # 1 - Data Storage A multi-temperature data management solution to
be proposed by vendor where
data that is frequently accessed on fast
storage—hot data—compared to less frequently
accessed data stored on slightly slower
storage—warm data—and
rarely accessed data stored on the slowest storage
—cold data. System should
also be capable of automated storage tiering and
seamless data transfer between
hot, warm and cold storage. Data residing in any of
these storage areas must be
seamlessly mixed / merged according to
requirements without impacting
performance.
Kindly share the tentative timeline for Hot/Warm/Cold
data so that we could calculate the size. Example: Hot
Data - 6 months, Warm Data - 1 year, Cold data > 1
year etc.
Duplicate Query. Refer to Sr.No. #317
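The hot/warm/cold windows suggested in this query (e.g. hot under 6 months, warm under 1 year) can be expressed as a simple age-based tiering rule. The thresholds below are the bidder's example values, not the Bank's actual retention policy:

```python
from datetime import date, timedelta

# Illustrative tiering thresholds (the Bank's actual windows were not
# disclosed in the EOI): hot <= ~6 months, warm <= 1 year, else cold.
HOT_DAYS, WARM_DAYS = 182, 365

def storage_tier(last_access: date, today: date) -> str:
    """Classify a data set by age since last access."""
    age = (today - last_access).days
    if age <= HOT_DAYS:
        return "hot"
    if age <= WARM_DAYS:
        return "warm"
    return "cold"

today = date(2020, 1, 1)
assert storage_tier(today - timedelta(days=30), today) == "hot"
assert storage_tier(today - timedelta(days=200), today) == "warm"
assert storage_tier(today - timedelta(days=400), today) == "cold"
```

An automated tiering engine would evaluate such a rule on a schedule and move or merge data between tiers transparently, as the quoted clause requires.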
724 18 18 Transformations for this activity can be categorized
into the following types:
· Existing transformations in DWH that need to be
migrated to NEXT-GEN DW
Please share the existing transformation details. Duplicate Query. Refer to Sr.No. #318
725 20 1 Data older than specific duration as identified by
Bank to be archived in low cost cold storage.
Changing data archival rules should be easily
configurable. Vendor to propose solution for the
same with cheap and flexible storage and
processing
Please provide Retention period Duplicate Query. Refer to Sr.No. #319
726 20 5 Store backup of entire ecosystem on suitable cost-
effective, fast recovery infrastructure (Currently
tape backup is taken)
Please provide the details of existing Tape Backup
Solution, Backup Window, Backup Throughput and
Restoration throughput
Duplicate Query. Refer to Sr.No. #320
727 13 Annexure B - Technical Criteria/Scope
of Work > Critical Functional
Requirements > Data Ingestion Sr #2
Data may be structured, semi-structured, and
unstructured. It may come from internal or external
sources. It may come in batches, incremental
additions or real-time feeds. There should be no
limitation on the type, format and size of data
ingested. Data may include log, feeds, audio, video,
image, NOSQL, RDBMS, unstructured text, through
ERP systems, etc
Which RDBMS source data is required to be extracted
in real-time mode? Please provide the source system
name, RDBMS type (Oracle / SQL Server etc.) and the
underlying OS
Duplicate Query. Refer to Sr.No. #321
728 19 Annexure B - Technical Criteria/Scope
of Work > Critical Functional
Requirements >Migration from
Existing Setup to Proposed Solution
Sr #5
Migration of existing data extraction and reporting
jobs.
Since this is across different products, is this expected
to be semi-automated / manual?
Duplicate Query. Refer to Sr.No. #322
729 19 Annexure B - Technical Criteria/Scope
of Work > Critical Functional
Requirements >Migration from
Existing Setup to Proposed Solution
Sr #6
Migration of monitoring dashboard data points. Since this is across different products, is this expected
to be semi-automated / manual?
Duplicate Query. Refer to Sr.No. #323
730 19 Annexure B - Technical Criteria/Scope
of Work > Critical Functional
Requirements >Migration from
Existing Setup to Proposed Solution
Sr #7
Migration of user details. Since this is across different products, is this expected
to be semi-automated / manual?
Duplicate Query. Refer to Sr.No. #324
731 19 Annexure B - Technical Criteria/Scope
of Work > Critical Functional
Requirements >Migration from
Existing Setup to Proposed Solution
Sr #8
Migration of Data Governance, Data Lineage and
Data Quality rules and policies
Since this is across different products, is this expected
to be semi-automated / manual?
Duplicate Query. Refer to Sr.No. #325
732 38 Annexure C - Monthly Data
processed in DWH Warehouse Sr #4
Contribution from other source systems like DMAT,
CMP, SBI Life, LOS, etc.
Will these also be flat files? If not, what will be the
interface mode (RDBMS, web services, API etc) and
which RDBMS?
Duplicate Query. Refer to Sr.No. #326
733 38 Annexure C - Monthly Data
processed in DWH Warehouse Sr #4
Contribution from other source systems like DMAT,
CMP, SBI Life, LOS, etc.
Please provide a count of data sources (best
approximation), and of these, how many will be flat-file
sources?
Duplicate Query. Refer to Sr.No. #327
734 23 Annexure B - Technical Criteria/Scope
of Work > Critical Functional
Requirements > Data Quality Sr #12
Mechanism to capture feedback from end users to
report Data Quality issues
Please elaborate. Can this be implemented using
enterprise collaboration tooling / ticket maintenance
system?
Duplicate Query. Refer to Sr.No. #328
735 39 Annexure D - Existing Data
Warehouse Architecture Sr #14
Data Quality What are the existing Data Quality details? How many
and which entity masters are maintained? What is the
current count of each type of Entity and how are their
counts expected to scale up (volumetrics)?
This information is not required at this stage of EOI. Bidders are free to
propose solution(s) in the best interest of the Bank to meet the requirements
given in the EOI.
736 35 24 Vendor needs to provide Helpdesk support 24X7 to
the Bank for end to end support for hardware
maintenance.
Please confirm whether Helpdesk services need to be onsite or
can be delivered from Wipro's remote delivery location SNXT, Mysore.
Onsite only
737 Please mention the SLA (resolution, response time,
etc.) for Infrastructure Operation and Maintenance
(L1, L2, L3) along with Priority definitions of Incidents
and Service Requests.
This information is not required at this stage of EOI.
Please confirm whether the ITSM tool / Helpdesk tool will be
provided by SBI. Also, please name the ITSM tool present in
the environment.
Bidder will have to propose required tool. Bank will take a call on it at
appropriate time.
739 Please confirm whether the ITAM tool / Asset management tool
will be provided by SBI. Also, please name the ITAM tool
present in the environment.
Bidder will have to propose required tool. Bank will take a call on it at
appropriate time.
740 Please confirm the EMS Tool/ Infrastructure
monitoring tool present in SBI's environment. Also
please confirm Wipro can leverage the same for
monitoring.
Bidder will have to propose required tool. Bank will take a call on it at
appropriate time.
741 Please confirm if a patch management tool is available
in the environment and whether the same will be extended to
Wipro.
Bidder will have to propose required tool. Bank will take a call on it at
appropriate time.
742 The SI assumes that, with DC and DR being on Cloud, there is no need
for Hands and Feet Support. However, please confirm
if the SI needs to provide "Hands and Feet Support" for
any DC/DR hardware. If yes, please provide the
location (pin code) and service window.
This information is not required at this stage of EOI.
743 Please confirm the backup software available in the
environment and whether the same will be extended to Wipro.
Bidder will have to propose required tool. Bank will take a call on it at
appropriate time.
744 Please confirm the DC/DR drill frequency This information is not required at this stage of EOI.
745 27 1 Please confirm whether any BCP tools/solutions are present with SBI
and whether the same can be extended to Wipro.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
746 13
specification
Please clarify what is meant by internal and external
specification.
Duplicate Query. Refer to Sr.No. #637
747 34 15 The Vendor shall ensure all Installations &
Implementation to be done by OEM badged
resources only
Request the Bank to modify this to OEM/Bidder
resources.
Duplicate Query. Refer to Sr.No. #638
748 15 Architecture diagram Deployment plan - Vendor to
submit architecture diagram of entire setup with
network and security equipment required. Bank
may change it after vetting by Information Security
Dept and / or Enterprise architecture Dept. It will
be binding on vendor.
Will this setup be completely greenfield? Is the bidder
expected to supply the network and security
equipment?
Duplicate Query. Refer to Sr.No. #639
749 34 18 The proposed hardware is mission critical for the
proposed project and support of 24 X 7 with an
uptime of 99.99 % to be ensured by providing
support at PR, and DR site for a period of 5 years.
Is bidder expected to provide onsite resources for
every domain? Please clarify.
Duplicate Query. Refer to Sr.No. #640
750 35 22 The Hardware solution must be compatible to
integrate with various systems in the Bank
including but not limited to SOC, PIMS, NOC,
Command Centre, ITAM, Service Desk, ADS, and
SSO etc. at no extra cost. Vendor will have to give
appropriate support to the Bank during integration
with various components of IT environment.
Please clarify what integration is expected? Duplicate Query. Refer to Sr.No. #641
751 4 Background Data Insights / Patterns / Real-Time The Elasticsearch functionalities required for
addressing it are not clearly mentioned.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
752 12 End State Objectives AI/ML Capabilities & Real-Time Analytics. Log
Analytics
The latest Elasticsearch functionalities to address the
requirement are not coming out clearly in the document.
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
753 22 Data Dictionary Search
Audit Trail/Log
Request to specify in detail
This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
754 26 Audit & Log Management Audit & Log Mgmt Are the latest features and functionality for Audit & Log
Analytics required?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
755 40-43 Functional Use Cases Customer / Product / Fraud Use Cases Are the advanced capabilities for searching & analyzing
data to detect anomalies, patterns, fraud, segmentation,
etc. required?
Yes, Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
756 NA General Query Security & Cyber Insurance Are Log Analytics, Metrics, Security Analytics required
as part of proposed solution?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
757 NA General Query Enterprise Search Is Enterprise & Application Search capability required
as part of the proposed solution?
Bidders are free to propose solution(s) in the best interest of the Bank to
meet the requirements given in the EOI.
758 11 Annexure-I Eligibility Criteria New Point. EoI is silent on Consortium bidding Can the EoI be submitted by a consortium of 2
companies? If not, can a bidding agency sub-contract
a few niche components at a later stage?
The bidder will be single organization and may have arrangements with
various other organizations at the back-end. The nature of the arrangement
at the back-end will be decided by the Bank at appropriate time.
759 New Point Has the Bank chosen any Technology Stack? This is a solution discovery phase. Hence we are asking for best possible
technologies to give best performance in proposed solution. Bidders are free
to propose suitable solution to meet the requirements of this EOI.
760 New Point Is the Bank open to "Open Source Software
tools and Platforms"?
Please refer to EOI clauses for more details
761 New Point Should the Next Gen DWH solution be deployed
on-premise, on cloud, or hybrid?
Public and private clouds deployed in non-SBI data centers are not required
as part of solution.
762 New Point During the RFP stage, SBI is requested to keep a turnover
limit wherein niche product/solution companies like
Posidex would be able to bid. SBI should also permit
either consortium bidding or allow sub-contracting of a
few specialised tasks, viz. entity resolution, creation of
a Golden Record, use of blockchain, etc.
This information is not required at this stage of EOI.
763 13 2 Data Ingestion
Data may be structured, semi-structured, and
unstructured. It may come from internal or external
sources. It may come in batches, incremental
additions or real-time feeds. There should be no
limitation on the type, format and size of data
ingested. Data may include log, feeds, audio, video,
image, NOSQL, RDBMS, unstructured text, through
ERP systems, etc
Is the NOSQL data mentioned here JSON/XML? What
kind of processing on NOSQL data is expected? Is there a
need to join the NOSQL data with other relational data?
Is there a need to shred the NOSQL data into
relational data?
Duplicate Query. Refer to Sr.No. #189
764 14 11 Data Ingestion
One of the most important features is the richness
of the transformations to do day-to-day tasks, such
as;
Data conversion, lookup, expression, joining
records, splitting data, filtering, ranking, sorting,
grouping, looping, and combining data,
pivot/unpivot, converting dates, setting variables
based on parameter files, merging rows, finding the
latest file, and splitting data based on certain
conditions, running web methods, transforming
XML documents, rebuilding indexes, sending
emails, profiling data, handling arrays and records,
processing unstructured data, masking, monitoring
the inbound data flow for completeness,
consistency and accuracy, wizards to assist creating
complex packages, like loading fact tables, or type
two slowly changing dimensions (SCD – T2)
Are any tools / licenses already available with SBI for
ingesting data used in the existing DW solution?
Please provide a list. Any existing tools that align with
the new solution can be considered for reuse if found
to be a good fit.
Duplicate Query. Refer to Sr.No. #190
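One of the transformations named in the quoted clause, a Type-2 slowly changing dimension (SCD - T2) load, keeps history by closing the current version of a row and opening a new one when attributes change. A minimal sketch; the row layout and column names are illustrative assumptions, not SBI's actual model:

```python
from datetime import date

def apply_scd2(dimension, incoming, today):
    """Minimal SCD Type-2 merge.
    `dimension` rows are dicts with key, attrs, valid_from and valid_to
    (valid_to is None for the current version). Changed keys expire the
    current row and append a new one; unseen keys are inserted."""
    current = {r["key"]: r for r in dimension if r["valid_to"] is None}
    for rec in incoming:
        cur = current.get(rec["key"])
        if cur is None:
            dimension.append({**rec, "valid_from": today, "valid_to": None})
        elif cur["attrs"] != rec["attrs"]:
            cur["valid_to"] = today  # expire the old version
            dimension.append({**rec, "valid_from": today, "valid_to": None})
    return dimension

dim = [{"key": 1, "attrs": {"city": "Mumbai"},
        "valid_from": date(2019, 1, 1), "valid_to": None}]
dim = apply_scd2(dim, [{"key": 1, "attrs": {"city": "Pune"}}],
                 date(2020, 1, 1))
# history row is closed; a new current row carries the changed attribute
```

An off-the-shelf ETL tool would typically generate this pattern through its SCD wizard rather than hand-written code, which is what the clause asks vendors to prefer.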
765 15 1 Data Storage
Vendor should propose effective number of data
storage layers in NEXT-GEN DW between data
ingestion and data consumption.
In the existing DW solution, are surrogate keys used?
If yes, is there a framework for storage and
management of the keys to ensure robustness of the
data warehouse?
Duplicate Query. Refer to Sr.No. #191
766 16 11 Data Storage
The storage system should be robust to handle at
least 1,50,000 concurrent queries (Select/DML) by
processing engines / ETL jobs / end users scalable
up to 6,00,000 concurrent queries in next 5 years
(assuming parallelism of 100 degree).
Will all these queries be executing in-flight at the
same time? Or will they be initiated over a period of
time, with the aggregate number of queries run during
that time period being 600,000?
Duplicate Query. Refer to Sr.No. #192
767 16 13 It should be possible to project and view data
through multiple modes using the Storage on NEXT-
GEN DW. Varieties of GUIs should be available to
project or view the output generated through
analytic processes. For instance: The Bank may
decide to implement use-cases that project
transactions data as a graph data structure. The
Storage solution on NEXT-GEN DW should allow for
such projections.
Do we need to enable graph-based analytics as well, or
should it be limited only to the facility to access the
data for graph-based analytics?
Duplicate Query. Refer to Sr.No. #193
768 17 1 For the data to be accessible and consumable by
businesses / downstream applications, the NEXT-
GEN DW should have robust, highly efficient and
parallel execution of data transformation jobs.
We can enable JDBC/ODBC or REST API based access.
Is there any specific connectivity mechanism required,
such as a specific type of driver needed for
connectivity with the SAS system, etc.?
Duplicate Query. Refer to Sr.No. #194
769 17 2 Data Processing Framework
The NEXT-GEN DW ecosystem should have state of
the art data processing engines that can perform in-
memory processing to reduce the time for data
transformations and query in case of real time
requirements.
What are the expected latency requirements for real-
time processing? The solution may vary based on the
use case, i.e. whether the requirement is for immediate
processing or a latency of up to 5-10 minutes is acceptable.
Duplicate Query. Refer to Sr.No. #195
770 17 3 Framework should allow joining multiple
sources/tables/inputs etc.
We understand that "source" here means the data
ingested from the multiple source systems and present
on the NEXT-GEN DW platform, not the data residing
at the source systems. Please confirm.
Duplicate Query. Refer to Sr.No. #196
771 17 5 Framework should be capable of performing
validation checks pre-and post- processing.
What will be the outcome of validation? Duplicate Query. Refer to Sr.No. #197
772 17 8 Data Processing Framework
Should have audit and error logs for auditing and
troubleshooting
Is there a requirement to maintain row-level
traceability of data records, i.e. from the data
consumption layer backwards up to the originating
source file for a particular record?
Duplicate Query. Refer to Sr.No. #198
773 17 9 Automatic recovery of data after failure/rejection
of record needs to happen without any manual
intervention
Is there any specific treatment that needs to be
performed for rejected records?
Duplicate Query. Refer to Sr.No. #199
774 17 11 Framework should have mechanism to protect data
at rest and at motion from unauthorized user
access and amendments.
Is it required to encrypt data at rest and in motion? Is
a data masking tool available with the Bank? Is there
a need for data masking?
Duplicate Query. Refer to Sr.No. #200
775 17 14 Data Processing Framework
Data transformations should be triggered in
parallel. The NEXT-GEN DW should be capable to
run multiple transformation jobs in parallel. The
NEXT-GEN DW should be able to run at-least 1500
jobs in parallel, scalable up to 5000 in next 5 years,
of varying complexity - simple, medium, complex, in
batch or near real time mode every day.
Is the bulk of the data transformation jobs expected to
be triggered during non-business hours, when user
reporting and other workloads are at a minimum? Will
there be users across multiple time zones, or will a
large part of the user base be within a single time
zone?
Duplicate Query. Refer to Sr.No. #201
776 17 17 The processing pipelines for ETL/ELT jobs also
include real time, daily, weekly, monthly, quarterly
and annual reports, feeding data structures for
downstream consumption. These activities are in-
scope for this engagement.
How many such reports are there, and what is their
data model? This is needed to estimate the number of
ETL jobs required for the end system.
Duplicate Query. Refer to Sr.No. #202
777 17 19 The workflows should work with standard
schedulers. Monitoring and management of
workflows should be possible from an easy to use
interface. Workflow management tool(s) should
have connectors / pluggable interfaces to already
existing / in-use proprietary software available with
the Bank. These could be (and not restricted to)
data repositories, reporting tools, data analysis
tools and generic interfaces for data transfer.
Scheduled jobs status should be made available to
the Bank in Monitoring dashboard on real time
basis.
What are the supported integration mechanisms for the
proprietary tools? Do those tools support REST-based
integration? Which scheduling and monitoring tools
does the Bank have?
Duplicate Query. Refer to Sr.No. #203
778 18 1 Migration from Existing Setup to Proposed Solution
Vendor should propose a detailed seamless
automated migration plan from existing setup to
proposed solution. Plan should focus on less
manual intervention, data reconciliation between
the systems and minimum parallel run of existing
and proposed solution.
Integration of real-time data with the data on the
NEXT-GEN DW is possible, but does this requirement
mean integrating with multiple source systems? Do we
get direct access to the source systems?
Duplicate Query. Refer to Sr.No. #204
779 18 1 Migration from Existing Setup to Proposed Solution
Vendor should propose a detailed seamless
automated migration plan from existing setup to
proposed solution. Plan should focus on less
manual intervention, data reconciliation between
the systems and minimum parallel run of existing
and proposed solution.
What is the defined reconciliation mechanism? Is it
point-in-time based? Each system will hold different
data depending on the timing and frequency of ETL job
execution.
Duplicate Query. Refer to Sr.No. #205
780 18 2 Migration from Existing Setup to Proposed Solution
Data migration from Staging and Data Marts, user
tables and any other schemas identified by Bank.
Along with the Staging and Data Mart objects, is there
any integrated data layer in the existing solution? If
yes, then is it built using any proprietary data model?
Duplicate Query. Refer to Sr.No. #206
781 29 7 Data Science Platform with AI/ML Capabilities
In-memory computing & integration with Spark,
Redis, etc
What kind of analytic processing is expected on Spark,
Redis, etc.? Is this using Spark ML, for example?
Duplicate Query. Refer to Sr.No. #207
782 29 18 Data Science Platform with AI/ML Capabilities
All machine-learning platforms either support
multiple models out of the box or provide an
option to custom-code the same
What kinds of ML use cases are expected? This will
help us understand the need for out-of-the-box vs.
custom solutions.
Duplicate Query. Refer to Sr.No. #208
783 29 19 Data Science Platform with AI/ML Capabilities
Integration with R, Python, Keras,
Tensorflow,Theano, scikit-learn etc and other
frameworks / languages
Is there a need to connect with any other analytic
engines to be run in a parallelized/distributed
manner on the system?
Duplicate Query. Refer to Sr.No. #209
784 29 24 Data Science Platform with AI/ML Capabilities
Annexure E gives sample use cases which are to be
implemented on Next Gen Data Warehouse using
structured and/or unstructured and/or semi-
structured and/or any other kind of data gathered
from either Data Warehouse or Data Lake or Data
Virtualization or all together or any other source.
Is there a need to read data directly from a low-cost
storage system and perform complex analysis/queries
involving multi-table joins with curated relational data,
using SQL/analytic functions that require
performance and scalability?
Duplicate Query. Refer to Sr.No. #210
785 35 33 Hardware Specifications
Next-Gen DW should support at-least 500
concurrent users, scalable up to 1000 users in next
5 years, running ETL/ELT jobs or doing ad-hoc data
extraction requests on database (Not including API
based access or scheduled job connections to
database)
What is the nature and expected concurrency of API-
based access? What is the nature and concurrency of
the scheduled job connections? Are these ETL or
maintenance-related connections?
Will the 1000 concurrent users be expected to be
running queries simultaneously, or is this just 1000
concurrent logons?
Duplicate Query. Refer to Sr.No. #211
786 36 37 Hardware Specifications
Ad-hoc jobs of any complexity should not hamper
the scheduled jobs performance.
What is the expected mix of queries in terms of tactical
(very short), medium, long-running (reports), batch
loads, near real-time and real-time loads running
simultaneously on the system?
Duplicate Query. Refer to Sr.No. #212
787 18
Migration from Existing Setup to
Proposed Solution
Vendor should propose a detailed seamless
automated migration plan from existing setup to
proposed solution. Plan should focus on less
manual intervention, data reconciliation between
the systems and minimum parallel run of existing
and proposed solution.
1. How frequent are changes to the code?
2. What is the defined process to capture change
requests?
3. What version control tool is currently being used?
4. What is the current release management process?
5. What is the definition of Minimum Parallel Run?
6. Can the work be done from Teradata offshore
locations, or does it have to be done onsite?
7. What is the current testing strategy? Is there any
document?
8. Are there any performance-related expectations?
9. What is the inventory count of objects (if possible,
by complexity) of the existing DWH that need to be
migrated?
10. Can the legacy code be shared for pattern analysis?
(If not the complete code base, can a sample be
shared?)
11. What kind of test automation tools are used
currently?
12. What is the typical availability of customer SMEs
for UAT?
13. Does the customer have the business
reconciliation queries? In other words, how is the data
loaded in the current environment verified?
14. Does the customer have an existing Test/QA
environment?
15. What kind of documentation is required as part
of the deliverables?
Duplicate Query. Refer to Sr.No. #213
788 18
Migration from Existing Setup to
Proposed Solution
Data migration from Staging and Data Marts, user
tables and any other schemas identified by Bank.
1. How many more data marts are there other than the 4?
2. Are these data marts built on different databases?
3. Does the DWH comprise multiple data marts
only, or is there an integrated EDW and model
already in place?
4. Is there any subject area priority (logical split of the
Next Gen DW) and anticipated sizing?
5. What are the SLA times of current ETL jobs?
6. Are the large tables horizontally partitioned?
Duplicate Query. Refer to Sr.No. #214
789 18
Migration from Existing Setup to
Proposed Solution
Data migration from existing archival solution to
new one.
1. What is the existing archival solution? Duplicate Query. Refer to Sr.No. #215
790 18
Migration from Existing Setup to
Proposed Solution
Migration of existing data sourcing ETL jobs. 1. Do they maintain/update a Data Mapping Sheet for
ETL/data ingestions?
2. Is the data currently loaded in batch or mini-batch mode?
3. Do they have any design and coding standards?
4. Do they have documentation of the current ETL
architecture/solution, code patterns and their
complexity?
5. Which scheduler is being used?
6. What is the current data volume, and how much
data is ingested through the different ingestion
mechanisms (batch, real-time, etc.)?
7. Do they want to change any of the current (ETL) tools
they have? If yes, what would the tool stack be?
8. Do they currently have an ETL control framework
implemented?
Duplicate Query. Refer to Sr.No. #216
791 19
Migration from Existing Setup to
Proposed Solution
Migration of monitoring dashboard data points 1. Is this to show the progress of the migration underway?
2. Is there a need to build a migration framework for
future data migrations?
Duplicate Query. Refer to Sr.No. #217
792 19
Migration from Existing Setup to
Proposed Solution
Migration of Data Governance, Data Lineage and
Data Quality rules and policies
1. What are the existing Data Governance, Data
Lineage and Data Quality rules and policies?
Duplicate Query. Refer to Sr.No. #218
793 19
Migration from Existing Setup to
Proposed Solution
Migration of All the remaining components of
existing ecosystem (Mentioned in Annexure - D) as
and when identified by Bank like job scheduler,
reports, history of version control, existing tape
backup, etc.
1. Will access be provided to all systems? What
would be the constraints?
Duplicate Query. Refer to Sr.No. #219
794 19
Migration from Existing Setup to
Proposed Solution
Vendor should list out all types of risks they expect
during the migration. Vendor should provide
justification if any downtime is required on existing
or proposed system during migration. Vendor
should provide all the pre-requisites for the
migration in the proposal.
1. What source systems (ERP, CRM, etc.) feed
the current DWH?
2. What is the current downtime schedule for the
existing DWH?
Duplicate Query. Refer to Sr.No. #220
795 19
Migration from Existing Setup to
Proposed Solution
Vendor to review the existing architecture during
migration and remove duplication of data and
recommend improvements in overall setup if any
1. What are the deduplication rules? Are any
currently defined?
Duplicate Query. Refer to Sr.No. #221
796 19
Migration from Existing Setup to
Proposed Solution
Vendor should provide a feasible plan for best use
of existing infrastructure which is procured during
last 10 years in staggered manner during the
implementation of Next-Gen DW which will save
cost to the Bank. (Annexure D gives the technology
architecture of the current setup)
1. More detailed information is needed on the existing
infrastructure and ecosystem.
Duplicate Query. Refer to Sr.No. #222
797 13 Scope of Work Team structure (without actual profiles) Does the Bank anticipate OEM involvement in
implementing core OEM-related services via the
Systems Integrator?
Duplicate Query. Refer to Sr.No. #223
798 13 Critical Functional Requirements -
Data Ingestion
Data may be structured, semi-structured, and
unstructured. It may come from internal or external
sources. It may come in batches, incremental
additions or real-time feeds. There should be no
limitation on the type, format and size of data
ingested. Data may include log, feeds, audio, video,
image, NOSQL, RDBMS, unstructured text, through
ERP systems, etc
Please provide the ratio of Structured : Semi-Structured :
Unstructured (Images) : Unstructured (Videos) data.
This helps in solutioning, taking ground realities
into consideration.
Duplicate Query. Refer to Sr.No. #224
799 25 Critical Functional Requirements -
Elements based Reporting
Vendor should follow the RBI guideline in
developing the solution with which it will be easier
for the Bank to migrate to the element-based data
reporting envisaged by the RBI.
Please elaborate with an example or two explaining
element-based reporting, to establish a common
understanding.
Duplicate Query. Refer to Sr.No. #225
800 42 Annexure E- Functional Use Cases
(Risk Area)
General Would the Bank expect Graph Analytics capabilities in
the solution for better network analytics, which helps
determine the betweenness and strength of a network?
Duplicate Query. Refer to Sr.No. #226
801 25, 39 Critical Functional Requirements -
Regulatory Reporting
Automation –Tool should automate analytics and
reporting workflow end-to-end, including all data
collection, enrichment, and management, as well as
all calculations, processes to final report
submission. Currently 500+ jobs are being used for
Tranche 1 DCT generation along with 500 more for
other regulatory reports/returns.
Please provide the number of returns/reports for the
regulatory body over and above the ones listed in
Annexure E (S.No. 4).
Duplicate Query. Refer to Sr.No. #227
802 General General General By when is the RFP expected, and by when is the Bank
expecting to conclude this process?
This information is not required at this stage of EOI.
803 General General General The reason for this question is that if the contract for
the existing data warehouse ecosystem with the existing
vendor is nearing completion, this will have a
direct bearing on the migration strategy as well as on
extending licensing until the migration from the
existing to the new system takes place (which may take
a few months).
Duplicate Query. Refer to Sr.No. #228
804 General General General We understand that SBI has an MDM solution, hence
data quality issues and duplication must be under
control. What DQ tools and what level of quality are
expected for the EDW?
Duplicate Query. Refer to Sr.No. #229
805 39 Annexure D Existing Data Warehouse Architecture What is the current number of users at the
BI/ETL/Database level? What is the user concurrency at
the BI/ETL/Database level? Also, what is the type/make/core
information of the ETL/DWH/ODS/DM servers?
Duplicate Query. Refer to Sr.No. #280
806 39 Annexure D Existing Data Warehouse Architecture Existing backup details are not provided; please
provide backup and other configuration details:
whether the existing backups are disk-based or tape-based,
and the frequency of backups.
Duplicate Query. Refer to Sr.No. #281
807 19 Disaster Recovery
clause 1
Bank proposes to setup only functional DR to start
with. At later stage Bank may take decision to
setup full scale 100% DR.
Please elaborate on functional DR in terms of PROD
capacity. Is DR being considered from Day 1?
Duplicate Query. Refer to Sr.No. #282
808 19 DR Clause 4 The DR solution should be synced with production
NEXT-GEN DW. The SLA for RTO should be
maximum 2Hrs as per Bank’s defined policy.
Since the volumes involved are large, we assume the
bandwidth capacity and the time slots for replication
will be provided by the Bank. Please clarify.
Duplicate Query. Refer to Sr.No. #283
809 19 DR Clause 6 The proposed solution is expected to have a
monitoring engine that can determine the health of
production NEXT-GEN DW and raise alerts / trigger
remedial actions to bring NEXT-GEN DW – DR as
the default NEXT-GEN DW
We understand the Bank wishes to have an automation
tool for this. Please clarify.
Duplicate Query. Refer to Sr.No. #284
810 32 HW Specs clause 10 Vendor must provide detailed configuration of the
proposed Hardware, including Hosting Space
Requirements, Racks, Power, Cooling and any other
requirement for the fulfillment of the Vendor’s
obligation in this EOI.
For exact sizing, various inputs/interactions will be
required. For the EOI, an approximate indication should
suffice. Please confirm.
Duplicate Query. Refer to Sr.No. #285
811 35 HW Specs Clause 30 Vendor is required to provide the minimum
resources to monitor & manage the infrastructure,
however it is the Vendor’s responsibility to right
size the resources to meet the SLA
Clarification is needed on the Bank's expectations
regarding the number of resources. Could you please
explicitly state the SLA requirements?
Duplicate Query. Refer to Sr.No. #286
812 36 HW Specs Clause 44 Vendor need to propose a solution for data
migration / transfer between Existing DWH (Navi
Mumbai Location 1) and NEXT-GEN DW-PR (Navi
Mumbai Location) and also between NEXT-GEN DW-
PR (Navi Mumbai Location 2) and Hyderabad (DR)
or any other places for PR and DR decided by the
Bank.
Please share details of the locations, approximate
distances, and the bandwidth capacity to be provided by
the Bank. Please clarify.
Duplicate Query. Refer to Sr.No. #287
813 36 HW Specs Clause 48 The vendor should provide EXACT size needed for
production in the 1st year and estimated sizes for
consecutive years keeping in view the growth rate
predicted by Bank in this section and provide
empirical evidence for the calculation of growth
rate.
For exact sizing, various inputs/interactions will be
required. For the EOI, an approximate indication should
suffice. Please confirm.
Duplicate Query. Refer to Sr.No. #288
814 39 Annexure D Existing Data Warehouse Architecture Please share: 1. a detailed architecture diagram of the
existing EDW; and 2. the breakup of tiers among the
80+ production servers.
Duplicate Query. Refer to Sr.No. #289
815 46 Annexure G Next Gen Data Warehouse Sizing Is DR to be sized only for the DWH? Please confirm. Duplicate Query. Refer to Sr.No. #290
816 40 Annexure E Functional Use Cases Is the Bank looking to leverage the existing OFSAA
implementation for these use cases? Alternatively, will
OFSAA be fed from the DWH? Please let us know the
interplay between the DWH and OFSAA.
Duplicate Query. Refer to Sr.No. #291
817 40 Annexure E Functional Use Cases - Does the Bank see the Next-Gen DW as a
complement to existing data solutions at the
Bank (e.g. OFSAA)?
Duplicate Query. Refer to Sr.No. #292
818 40 Annexure E Functional Use Cases - What is the role of the Next-Gen DW in mission-critical
operational processes, e.g. daily RBI and regulatory
reporting, real-time financial crime detection,
etc.?
Duplicate Query. Refer to Sr.No. #293
819 12 Annexure B End State Objective: Migration from existing setup Please provide complete details of the existing DWH
solution, including but not limited to:
DWH platform, model, version and size
Details of CPUs, memory, OS and database version
Details of the storage configuration: size, capacity, free
and used space, etc.
Whether there is any physical or logical isolation of the
DWH setup
HA, backup and DR details
Duplicate Query. Refer to Sr.No. #701
820 13 Annexure B Performance Benchmark of Next Gen DWH Oracle provides industry-standard, globally accepted
metrics to measure database performance, such as IOPS,
throughput and load rate. We believe this addresses
the requirement.
Duplicate Query. Refer to Sr.No. #295
821 16 4 Storage replication (e.g. RAID) should be
automatically managed by the platform.
We believe this refers to a RAID level; if not, please
elaborate further.
Duplicate Query. Refer to Sr.No. #296
822 16 8 Storage should support data compression. It should
be possible to perform both fast compression and
efficient compression based on data processing
needs.
We recommend the Bank additionally ask for a storage
and DWH system capable of providing columnar-based
as well as row-based compression, with data
readable without decompression.
Duplicate Query. Refer to Sr.No. #297
823 16 9 The storage should be horizontally and vertically
scalable. Redistribution of data across the NEXT-
GEN DW should be possible automatically and
seamlessly.
We recommend the Bank additionally ask that storage
upgrades and capacity increases be possible without
downtime.
Duplicate Query. Refer to Sr.No. #298
824 16 11 The storage system should be robust to handle at
least 1,50,000 concurrent queries (Select/DML) by
processing engines / ETL jobs / end users scalable
up to 6,00,000 concurrent queries in next 5 years
(assuming parallelism of 100 degree).
1. Please provide details of the queries and ETL jobs.
2. Please provide the ratio of Select/DML queries to ETL
jobs.
3. Please share the complexity mix of queries: simple,
medium, complex.
4. Also provide the definitions of simple, medium
and complex queries.
Duplicate Query. Refer to Sr.No. #299
825 16 12 Downstream departments (data consumer) to be
given separate processing power, storage to
undertake their requirements with separate DB
snapshot, Audit trails should be available for any
user accessing the Databases. Construction of this
separate Database snapshot and enabling this audit
trails must not cause any major systemic
issues/challenges in smooth functioning of primary
DB.
Please provide the rationale for having separate
processing and storage for this. Powerful DWH
systems today are capable of servicing all consumer
groups in parallel.
Duplicate Query. Refer to Sr.No. #300
826 17 12 The ETL/ELT tool for data extraction should have AI/ML
features for suggesting / improving Query / ETL /
ELT stages
Please clarify further with an example. Duplicate Query. Refer to Sr.No. #301
827 17 13 Existing reports and extracts generation jobs on
DWH should be analyzed and transformed to the
NEXT-GEN DW. The vendor should preferably use
off-the-shelf tools and not resort to building from
scratch.
Please share sample reports and extracts so that we can
propose the most suitable migration options.
Duplicate Query. Refer to Sr.No. #302
828 17 Data transformations should be triggered in
parallel. The NEXT-GEN DW should be capable to
run multiple transformation jobs in parallel. The
NEXT-GEN DW should be able to run at-least 1500
jobs in parallel, scalable up to 5000 in next 5 years,
of varying complexity - simple, medium, complex, in
batch or near real
Please define complexity (simple, medium, complex).
Please share the split % of simple, medium and complex jobs.
Please share sample jobs of each type.
Duplicate Query. Refer to Sr.No. #303
829 27 User Management: Pt4-The access privileges
associated with each system product, e.g.
operating system, network, database, application
and system utilities, and the users to which these
privileges need to be allocated should be clearly
identified and documented.
Should we assume that the access privileges are to be
assigned to the users directly and managing access to
these privileged accounts is not required?
Duplicate Query. Refer to Sr.No. #304
830 39 User database of 30000+ officials Should we assume approx 30k users would access the
solution with a YoY increase of 10%?
Duplicate Query. Refer to Sr.No. #305
831 30 6 Reporting on all types of available Data Formats;
· Structured, semi-structured, unstructured
· Click stream data
· Audit Logs
· Documents
· Multimedia data (Images/Videos/Audios)
· XBRL format
· IRIS iFILE framework
Please clarify the type of database used for each of the
data formats listed. Or is it safe to assume the
underlying data will be in an industry-standard
relational database format such as Oracle, DB2, etc.?
Duplicate Query. Refer to Sr.No. #306
832 30 5 Visualizations: BI tools must provide below
different types of visualizations;
· Animations, Barcodes
· Bar, line, pie, area and radar chart types
· Tables, Graphs, Infographics, Filters
· Widgets
· Drag and Drop Creation, Customization
· Templates
· Freehand SQL Command
· Geospatial Integration
· Layouts
· Themes
· Ability to mix and match various combinations
Please elaborate on the definition of:
- Animations
- Infographics: which visualizations are being referred to?
- Widgets, Templates
Duplicate Query. Refer to Sr.No. #307
833 32 22 In-memory analytics: The product should pull data
into an in-memory or locally cached data store
preferably columnar is an increasingly popular
feature that enables very fast analytics once the
data is loaded.
To achieve fast analytics, BI tools may adopt different
architectures. A BI tool can easily leverage the
in-memory benefits of the underlying database without
pulling data, avoiding data redundancy and thereby
reducing the data management burden at the BI layer.
We request that this point be rephrased as:
"In-memory analytics: The product should pull data
into an in-memory data store, or leverage the in-memory
capabilities of the underlying database, or use a locally
cached data store (preferably columnar), an increasingly
popular feature that enables very fast analytics once
the data is loaded."
Duplicate Query. Refer to Sr.No. #308
834 32 23 Offline updates: BI tools, when storing copies of
the source data in an online analytical processing
(OLAP) cube or in-memory columnar data store,
should enable business users to schedule
automatic data updates.
Different BI tools have different architectures. A BI tool
can easily leverage the capabilities of the underlying
database. Is this point relevant only to BI tools whose
architecture stores the data on the BI server?
Duplicate Query. Refer to Sr.No. #309
835 32 28 Speed of access: Query performance will vary based
on the complexity of the queries and the amount of
data involved. Dashboards with multiple
visualizations will need to get query results from
many queries. The best practice is to create several
prebuilt query scenarios and compare how each
product performs based on these specific
examples. The worst practice is to just arbitrarily
rate the speed.
These seem to be implementation best practices.
Please elaborate on what is required from the BI
tool.
Duplicate Query. Refer to Sr.No. #310
836 32 29 The best practice is to establish a testing
environment to determine scalability in terms of
both the number of concurrent users and data
metrics, such as volumes, variety and veracity.
These seem to be implementation best practices.
Please elaborate on what is required from the BI
tool.
Duplicate Query. Refer to Sr.No. #311
837 32 32 Ability to handle and summarize huge volumes of
data. E.g. 30-40 million rows accessed on index and
summarized over 5 to 8 metrics.
Please elaborate on the use case for consuming 30-40
million rows via BI. Usually a BI tool leverages the
underlying database to summarize the data and works
only on the resulting dataset.
Duplicate Query. Refer to Sr.No. #312
838 35 36 The web portal of Business Intelligence tool should
support at-least 25000
concurrent users, scalable up to 75000 in next 5
years, accessing various reports generated
For sizing the Business Intelligence solution we need a
bifurcation of the 25000 concurrent users:
Total concurrent users: 25000
Number of concurrent active users: please provide the count
Number of logged-in/inactive users: please provide the count
Of the active users:
- Users executing BIEE dashboards (having 4-5 reports
or simple charts in a dashboard)
- Users executing large pivot table operations (25000+
rows)
- Users executing export-to-PDF/Excel operations (small
to medium sized reports, 50K cells or fewer)
- Users executing very heavy graphics
Number of active concurrent extra-large reports running
(usually extra-large reports are executed in off-line
hours)
Duplicate Query. Refer to Sr.No. #313
839 15 point # 16 Proposed solution should be able to scrape
encrypted logs, capture metadata
changes at source level completely, scraping 4000-
5000 logs daily having a log
size of ~2 TB each, scalable up to 10000 logs.
Proposed solution should be
capable of scraping logs generated by any type of
database, e.g. Oracle
Database, IBM DB2 Database, etc.
Kindly specify whether decryption of the logs is also
required, or whether storage of such logs alone is
sufficient. If decryption is required, is the decryption
logic available in the specified systems?
Duplicate Query. Refer to Sr.No. #314
840 38 Annexure C -Monthly Data processed
in DWH Warehouse
Archived log extract CBS (SBI) +
TF (SBI)
Are these logs encrypted?
Do we need to keep the raw logs in the system, or
only the processed logs?
Duplicate Query. Refer to Sr.No. #315
841 29 # 9 (Data Science Platform with
AI/ML Capabilities)
GPUs to be incorporated in solution if possible
using HDFS Hadoop like
environment for better analytical results
1. Is there a requirement to run AI/ML models within
the HDFS/Hadoop environment, or is the expectation to
pull the data into a GPU-based analytics workbench and
then process it?
2. Running AI/ML models within Hadoop is also faster,
and having a separate GPU-based system for specific AI
models can reduce the cost of the GPU-based solution.
Please suggest.
Duplicate Query. Refer to Sr.No. #316
842 15 # 1 - Data Storage A multi-temperature data management solution to
be proposed by vendor where
data that is frequently accessed on fast
storage—hot data—compared to less frequently
accessed data stored on slightly slower
storage—warm data—and
rarely accessed data stored on the slowest storage
—cold data. System should
also be capable of automated storage tiering and
seamless data transfer between
hot, warm and cold storage. Data residing in any of
these storage areas must be
seamlessly mixed / merged according to
requirements without impacting
performance.
Kindly share the tentative timelines for hot/warm/cold
data so that we can calculate the sizes. Example: hot
data - 6 months, warm data - 1 year, cold data - more
than 1 year, etc.
Duplicate Query. Refer to Sr.No. #317
843 18 18 Transformations for this activity can be categorized
into the following types:
· Existing transformations in DWH that needs to be
migrated to NEXT-GEN DW
Please share the existing transformation details. Duplicate Query. Refer to Sr.No. #318
844 20 1 Data older than specific duration as identified by
Bank to be archived in low cost cold storage.
Changing data archival rules should be easily
configurable. Vendor to propose solution for the
same with cheap and flexible storage and
processing
Please provide the retention period. Duplicate Query. Refer to Sr.No. #319
Store backup of entire ecosystem on suitable cost-effective,
fast recovery infrastructure (Currently
tape backup is taken)
Please provide the details of the existing Tape Backup
Solution, Backup Window, Backup Throughput and
Restoration Throughput.
Duplicate Query. Refer to Sr.No. #320
846 13 Annexure B - Technical Criteria/Scope
of Work > Critical Functional
Requirements > Data Ingestion Sr #2
Data may be structured, semi-structured, and
unstructured. It may come from internal or external
sources. It may come in batches, incremental
additions or real-time feeds. There should be no
limitation on the type, format and size of data
ingested. Data may include logs, feeds, audio, video,
images, NOSQL, RDBMS, unstructured text, data from
ERP systems, etc.
Which RDBMS source data is required to be extracted
in real-time mode? Please provide the source system
name, RDBMS type (Oracle / SQL Server etc.) and the
underlying OS.
Duplicate Query. Refer to Sr.No. #321
847 19 Annexure B - Technical Criteria/Scope
of Work > Critical Functional
Requirements >Migration from
Existing Setup to Proposed Solution
Sr #5
Migration of existing data extraction and reporting
jobs.
Since this is across different products, is this expected
to be semi-automated / manual?
Duplicate Query. Refer to Sr.No. #322
848 19 Annexure B - Technical Criteria/Scope
of Work > Critical Functional
Requirements >Migration from
Existing Setup to Proposed Solution
Sr #6
Migration of monitoring dashboard data points. Since this is across different products, is this expected
to be semi-automated / manual?
Duplicate Query. Refer to Sr.No. #323
849 19 Annexure B - Technical Criteria/Scope
of Work > Critical Functional
Requirements >Migration from
Existing Setup to Proposed Solution
Sr #7
Migration of user details. Since this is across different products, is this expected
to be semi-automated / manual?
Duplicate Query. Refer to Sr.No. #324
850 19 Annexure B - Technical Criteria/Scope
of Work > Critical Functional
Requirements >Migration from
Existing Setup to Proposed Solution
Sr #8
Migration of Data Governance, Data Lineage and
Data Quality rules and policies
Since this is across different products, is this expected
to be semi-automated / manual?
Duplicate Query. Refer to Sr.No. #325
851 38 Annexure C - Monthly Data
processed in DWH Warehouse Sr #4
Contribution from other source systems like DMAT,
CMP, SBI Life, LOS, etc.
Will these also be flat files? If not, what will be the
interface mode (RDBMS, webServices, API etc) and
which RDBMS?
Duplicate Query. Refer to Sr.No. #326
852 38 Annexure C - Monthly Data
processed in DWH Warehouse Sr #4
Contribution from other source systems like DMAT,
CMP, SBI Life, LOS, etc.
Please provide a count of data sources (best
approximation), and of these, how many will be flat-file
sources?
Duplicate Query. Refer to Sr.No. #327
853 23 Annexure B - Technical Criteria/Scope
of Work > Critical Functional
Requirements > Data Quality Sr #12
Mechanism to capture feedback from end users to
report Data Quality issues
Please elaborate. Can this be implemented using
enterprise collaboration tooling or a ticket maintenance
system?
Duplicate Query. Refer to Sr.No. #328
854 39 Annexure D - Existing Data
Warehouse Architecture Sr #14
Data Quality What are the existing Data Quality details? How many
and which entity masters are maintained? What is the
current count of each type of Entity and how are their
counts expected to scale up (volumetrics)?
Duplicate Query. Refer to Sr.No. #735