State Bank of India - Sl. No EOI Page EOI Clause …...4 20 Critical Functional Requirement -...

Date post: 09-Jun-2020

Sl. No | EOI Page Number | EOI Clause Number | Existing Clause | Query/Suggestion | SBI Response

1 11 Annexure A - Eligibility Criteria, Point

No 2

The vendor should have existing Next-Gen Data Warehouse solution as mentioned in the EOI

Please clarify whether this criterion is for the OEM or for Bidders acting as SI.

Please refer to Corrigendum.

2 11 Annexure A - Eligibility Criteria, Point

No 3

The solution should have been

implemented in at least 2 large

scale organizations.

Documents to be submitted :

Two references with the following

details for each reference to

be provided:

1. Name of the

Organization

2. Name of the Official

3. Contact number of

Official

4. E-mail Id of Official

Please clarify: if the proposed solution has been implemented in a large organization, can we, as Bidder, use the OEM's experience and participate in the RFP? This will help us to qualify and also draw on the OEM's expertise for successful implementation of the Project.

Team Computers is a CMMI L3 IT service and solution company with more than Rs 500 Cr turnover and a dedicated focus on BI and Analytics. We have Rs 40 Cr+ yearly revenue from the Analytics business out of the 500 Cr. A certificate from the CA could be provided to satisfy the criteria. We have the capability to build a modern data warehousing solution.

Please refer to Corrigendum.

3 39 Annex. D (Sr. No 4) Reporting: 100+ Busted Reports

300+ Interactive Reports

Query: What is the total number of users who will consume business intelligence output on the business side?

Please refer to Hardware specification, point # 35 - The web portal of

Business Intelligence tool should support at-least 25000 concurrent users,

scalable up to 75000 in next 5 years, accessing various reports generated

4 20 Critical Functional Requirement -

Monitoring Dashboard # 1

Real Time data flow in dashboard

Query: What is the SLA on real-time reporting?

This information is not required at this stage of EOI.

5 32 Critical Functional Requirement -

Business Intelligence (Point 32)

Ability to handle and summarize huge volumes of

data.

Query: How do you want vendors to support this capability? Would lab reports / outputs suffice? Does this need to be demonstrated?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

6 31 Critical Functional Requirement -

Business Intelligence (Point 21)

Mobile Version

Query: Do you require access to BI reports and dashboards via native iOS and native Android mobile applications?

Yes

7 14 7 Existing ETL Jobs to be Fine Tuned

What kinds of ELT tools are currently in use? If multiple tools are used, approximately how many of the current 3000+ jobs run on each tool?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

8 14 13 Trigger mechanisms in identifying any structural

changes at source

Can you help clarify what the structural changes are?

DDL changes

9 14 14 The ingestion tools should be able to perform a

change data capture on source systems of this

nature with run time decompression functionality

If the change data capture is in use now, what's the

vendor product? How many jobs are running on

different CDC tools?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

10 15 16 Scraping 4000-5000 logs daily, with a log size of ~2 TB each, scalable up to 10000 logs.

How many source applications are involved

approximately?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

11 15 17 Solution should be able to handle DDL change

without manual reorg/runstat.

Could you please provide a use case for a DDL change?

This is an industry standard concept

12 15 18 A job scheduler, along with process management

controls that provide things like runtime

monitoring and error alerting, handling, and

logging.

Is any job scheduler currently in use? If yes, which one? If more than one, please list them all.

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

13 16 12 Downstream departments (data consumer) to be

given separate processing power, storage to

undertake their requirements with separate DB

snapshot

How many downstream departments might be

engaged for DB snapshot approximately? Are they

branch IT or business department?

This information is not required at this stage of EOI.

14 17 12 ETL/ELT tool for data extraction should have AI/ML features for suggesting / improving Query / ETL / ELT Stages

Could you elaborate on some use cases of the AI/ML feature? Do you know of any ETL tool with such a feature?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

15 18 16 Reports on job status and success / failure /

retrigger should be sent to concerned stakeholders

on a continuous basis

Through which channels are the reports sent out: email, SMS, or a real-time monitoring dashboard? Please clarify.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

16 18 19 Workflow management tool(s) should have

connectors / pluggable interfaces to already

existing / in-use proprietary software available with

the Bank. These could be (and not restricted to)

data repositories, reporting tools, data analysis

tools and generic interfaces for data transfer.

Which exact vendor products are meant by 'data repositories, reporting tools, data analysis tools and generic interfaces for data transfer'?

These are industry standard capabilities

17 18 3 Data migration from existing archival solution to

new one.

What is the existing archival solution? Is data archived to a tape library? Please clarify.

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI. Data

Archival is not on tapes and current archival solution is similar to production

environment and accessible to end users.

18 19 6 Migration of monitoring dashboard data points.

Which dashboard product is currently in use? It is an enterprise-level dashboard that will continue to be used, isn't it?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI. Bank

will take a final call at appropriate time.

19 19 8 Migration of Data Governance, Data Lineage and

Data Quality rules and policies

Is there any automation tool used for governance,

lineage, quality etc? What are they?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

20 19 9 history of version control, existing tape backup

What does 'version control' mean here? Is it the historical versions of programs / stored procedures, or a version control tool?

This is an industry standard terminology

21 19 9 history of version control, existing tape backup

Does this mean the tape backup is to be migrated to something like a data lake? Will the historical tape backup be part of the migration scope, or will it be left as it is, with only backups from now onwards in scope?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

22 19 12 Vendor should provide a feasible plan for best use

of existing infrastructure which is procured during

last 10 years

Could you please share the software product stack and

proprietary or open hardware specifics of existing

architecture?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

23 19 3 The NEXT-GEN DW – DR solution needs to be set

up at a remote location at Hyderabad.

What is the distance between the PROD site and the DR site? What is the network bandwidth between them?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

Responses to queries raised by Bidders in REQUEST FOR EXPRESSION OF INTEREST (EOI) FOR “NEXT-GEN DATA WAREHOUSE SOLUTION”. Request for EOI No.: SBI/GITC/Data Warehouse/2018/2019/34 Dated: 23.03.2019


24 20 3 The architecture of NEXT-GEN DW should enable

interaction between public cloud and designated

edge servers alone.

Could you please elaborate with some detailed use

case?

No component of the Next-Gen DW should connect directly to the public cloud. Secure designated edge servers are to be proposed by the Bidder for any data transfer between the Next-Gen DW and the public cloud.

25 21 16 Back Dated Data changes needs to be updated on

portal

Could you please illustrate with an example of a back-dated data change?

The monitoring dashboard should have the capability to showcase back-dated data changes.

26 22 3 Management of referential integrity

Could you please illustrate with an example of referential integrity?

This is an industry standard terminology

27 23 7 Recommend Enrichment — Enhancing the value of

internally held data by appending related attributes

from external sources

Could you please give an example?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

28 23 10 Identity resolution - Identity resolution is the

process of linking various records and is the main

engine for record de-duplication, which can enable

some aspects of data cleansing.

Could you please give an example?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

29 24 1 Authentication and Identity Management - A

comprehensive identity and access management

system should be available for centralized

management of users and groups.

Are you looking for a standalone AIM specific to this NEXT-GEN DW solution? Or is there already a centralized AIM at enterprise level, with which the newly introduced AIM of this solution must integrate or sync? Please clarify.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

30 25 1 Vendor should follow the RBI guideline in

developing the solution with which it will be easier

for the Bank to migrate to the element-based data

reporting envisaged by the RBI.

Does it mean a new regulatory reporting

framework/application is requested to replace the

existing one? Or just keep this regulatory reporting

framework/application but migrate the datastore to

NEXT-GEN DW? What is the current framework, from

which product vendor?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

31 28 5 Self-service portal to extract the data on their own

(Should support Data Democratization)

Is there any approval procedure, evaluation of shareability, or masking/transformation prior to self-service extraction, and any data destruction procedure afterwards? Who are the main requestors and what is the frequency of requests? Please clarify.

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

32 28 1 Implementing end to end analytics use-cases as

mandated by the Bank

Could you list all the analytics use cases, or some typical ones?

Please refer to Annexure E.

33 28 2 Power data / objects to existing analytics models

built on proprietary tools (IBM SPSS).

Please advise the exact version of IBM SPSS.

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

34 30 1 Capability to connect to various data sources

Is any BI tool in use at the moment? If yes, which tool and which version?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

35 39 1 Database appliance

What product is the existing appliance, e.g. Teradata, Exadata?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

36 39 3 Scraping of Jobs from Source

What kind of scraping tools are currently in use? Is there any preference to reuse these tools in this solution?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

37 39 3 Scraping of daily 5TB+ logs

What kind of logs are these? Are they database logs? If yes, which database products are involved, e.g. Oracle, DB2, SQL Server, MySQL?

Database logs. The Bidder's proposed solution should be capable of scraping from any kind of database product.

38 39 4 Reporting

Is any reporting tool in use, e.g. Tableau?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

39 39 6 Hardware Monitoring

What is the Monitoring product and version?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

40 39 6 Server memory and compute monitoring of total

80+ production servers

What type are these production servers: appliances or just commodity x86 servers? What are their typical specifications? How many years have they been in use?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

41 39 7 User database of 30000+ officials

How many different types of users are there? How many are data analysts who can write and run ad hoc SQL queries?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

42 39 8 Job Scheduling

Which scheduling tool is currently in use?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

43 39 18 10TB Oracle database

What version is used?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

44 48 SCAPM IBM Smart Cloud

What kind of applications are running on that? What is its interaction with the NEXT-GEN DW solution?

Please refer to Corrigendum.

45 40-43 1-32 Functional Use Cases

1) Are these use cases based on existing solutions with SBI (e.g. Risk, Online Fraud Prediction/Management, AML etc.), or are such solutions to be proposed by the vendor as part of the Next Gen Warehouse initiative? In that case, will the bank provide additional information to size/scope these solutions?

2) If these are existing solutions which will leverage the Next Gen warehouse, will it be the bidder's responsibility to integrate these solutions with the warehouse? If so, can the bank please share the details of all the solutions that need to be integrated?

1. These are sample use cases built for execution on the Next Gen DWH platform over a period of time for analytical studies. The Bank, at its own discretion, will implement new models/use cases on this setup in future.

2. This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

46 11 Eligibility Criteria, Clause 2 Vendor should have existing Next-Gen Data

Warehouse solution as mentioned in the EOI

Does the vendor need to be the OEM for all the solution components, or may the solution stack be a composite of other OEM / third-party solutions?

Is a consortium allowed for the complete solution, services and implementation? Or is SBI expecting a single-point bidder and implementer?

Please refer to Corrigendum.


47 32 Hardware Specifications,1 2 The Vendor is required to supply, install, test,

commission, monitor, manage and maintain the IT

System along with operating system and other

peripherals with one-year warranty and AMC for 4

years from the date of delivery at data centers

advised by the B

Is the bidder allowed to have alliances and

subcontracting for the maintenance?

Yes. Such engagements will be guided by a set of rules which the Bank will publish at a later stage.

48 13 Data Ingestion, 1 Capable of ingesting data from any source system

in automated manner currently implemented in the

Bank, or any future standard source systems that

the Bank will decide to use with high throughput

and low latency. Vendor to propose performance

benchmarking

Will the details of all categories of ingest jobs (source types / frequencies / loads / etc.) be made available, at least at the time of the RFP? "Any future standard source systems": can you provide a few examples of such future systems?

This information is not required at this stage of EOI.

49 15 Data Storage, 1 Vendor should propose effective number of data

storage layers in NEXT-GEN DW between data

ingestion and data consumption.

The number of data storage layers also depends on the use cases for the implemented data, e.g. based on latency, access levels, grain etc. Is it adequate if the solution capability in this regard is explained with a couple of examples, while there may be other constructs as well?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

50 17 Data Processing, 14 Data transformations should be triggered in

parallel. The NEXT-GEN DW should be capable to

run multiple transformation jobs in parallel. The

NEXT-GEN DW should be able to run at-least 1500

jobs in parallel, scalable up to 5000 in next 5 years,

of varying

The given metric is about only the counts of parallel

jobs. However processor, memory and compute

resources are also determined by the size and

complexity of the jobs, and the time window to

conclude. We would require additional details in the

count metric provided for sizing/ architecture

recommendations. Can the bank share details of the

complexity mix & matrix and the definitions thereof.

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

51 18 Migration from existing Setup, 1 Vendor should propose a detailed seamless

automated migration plan from existing setup to

proposed solution. Plan should focus on less

manual intervention, data reconciliation between

the systems and minimum parallel run of existing

and proposed solution.

Requesting the bank to share the existing architecture

details which will be required to address this point in

the EOI

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

52 32 Hardware Specifications, 2 Software and Solution proposed by vendors should

be compatible with all types of Hardware.

"All types of Hardware" is very generic. Software

components for such specialized jobs as mentioned in the EOI would need specific kinds of hardware. Could you please explain the purpose behind this requirement?

Are you looking at open source as your first choice for both hardware and software?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

53 3 Introduction Please note, the objective of this Request for EOI is

to identify all possible solution (s) for the scope of

work defined in this document.

Please clarify if there will be any shortlisting of

vendors or solutions based on this EOI evaluation.

This information is not required at this stage of EOI.

54 11 Annexure A -2 Vendor should have existing Next-Gen Data

Warehouse solution as mentioned in the EOI

Please clarify whether only OEMs having the required solution capabilities can participate in this EOI, or whether OEMs can also participate with their System Integration Partners.

Please refer to Corrigendum.

55 40 Annexure E Customer Analytics - Up-sell

Is this to be done by building an analytical model, or is the bank open to a pinpointed solution for it, like Analytical CRM?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

56 13 Annexure B Data Ingestion

What are the current systems from which the data needs to be ingested?

This information is not required at this stage of EOI.

57 17 Annexure B Data Processing Framework Point 1 - For the data

to be accessible and consumable by businesses /

downstream applications, the NEXT-GEN DW

should have robust, highly efficient and parallel

execution of data transformation jobs.

What are the downstream applications referred to herein?

This information is not required at this stage of EOI.

58 23 Annexure B Data Quality - Vendor should propose end-to-end

solution for Data Quality Management starting

from data origin till the data consumption. These

tool (s) to be used for addressing various aspects of

the data quality problem mentioned below on SBI

data set during data ingestion, data processing or

data consumption as advised by Bank on case by

case basis

Is there any DQ tool being used currently

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

59 25 Annexure B Regulatory Reporting - Vendor should follow the

RBI guideline in developing the solution with which

it will be easier for the Bank to migrate to the

element-based data reporting envisaged by the RBI

Is the element on a transaction level or entity level

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

60 28 Annexure B Data Science platform with AI/ML Capability -

Power data / objects to existing analytics models

built on proprietary tools (IBM SPSS). Migration of

such models to new solution

Which models are referred to herein

This information is not required at this stage of EOI.

61 29 Annexure B Data Science platform with AI/ML Capability -

Vendor to provide solution / tool (s) for below

scope of activities on SBI data sets;

- Benchmarking

- Predictive & Prescriptive Analytics

- Social Media Analytics

- Web Analytics

- Geolocation Analysis

- Ad-Hoc Analysis

- Trend Indicators

- Profit Analysis

- In-Memory Analysis

- Statistic Analytics

- Data Mining

Please clarify if the bank is using any tool for

Geolocation currently

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

62 31 Annexure B Business Intelligence Tools - Mobile version: BI

tools should be able to differentiate between

viewing BI applications on a web browser on a

mobile device versus a mobile BI application

a) Please clarify if there is a requirement to

view dashboards on mobile

b) Please clarify if there is a requirement to view any other data

point also on mobile device

Yes

63 42 Annexure E - Point 20 Organized Fraud/ Collusion Detection - Fraud/AML

Practice - Analysis and identification of factors

causing Fraud/AML practice

a) Please clarify if the bank requires capability to

detect bust-out frauds in corporate asset products,

retail asset products and liability products?

b) Please clarify if the bank requires capability to

identify the suspicious linkages based on demographic

and transaction to identify the collusion?

c) Please clarify if the bank requires capability related

to analytical methodologies to reduce false positives

in transaction monitoring in AML and Fraud areas

d) Please clarify if the bank requires capability of

building networks based on demographic and

transaction linkages?

e) Please clarify if the bank requires capability to

identify new patterns / new modus operandi in AML /

Fraud areas?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

64 42 Annexure E - Point 21 Opportunistic Fraud Detection - Identification of

Fraud and its prevention - Identify and predict

potential fraud across the Bank (i.e. Cheque fraud,

Remittance fraud, Card fraud, Online fraud, etc)

Please clarify if the bank requires capability to build

and refine the fraud detection models at regular

intervals to ensure the model learns based on the

latest collateral / fraud tagged data?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

65 42 Annexure E- Point 22 AML detection and alert management -

Identification of Money Laundering activities and

its prevention - Identification of money laundering

activities based on account transaction and

behavior

a) Please clarify if the bank requires capability of

identifying the shell company accounts, money mules

which are used as intermediaries for layering?

b) Please clarify if the bank requires capability of an

integrated modelling and deployment framework to

develop and test models for financial crimes.

c) Please clarify if the bank requires capability of

developing alert scoring models to identify the high

risk alerts?

d) Please clarify if the bank requires capability related

to analytical suppression of AML alerts to

reduce false positives

e) Please clarify if the bank requires capability related

to customer, account, product, transaction type etc.

scoring to contribute to alert scoring for AML.

f) Please clarify if the bank requires capability to

identify complex layering techniques involving large

number of accounts?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

66 43 Annexure E- Point 32 Transaction Fraud Prediction - Identification of

Suspicious transactions on Real Time Basis for all

digital channels - Identification of Suspicious

transactions on Real Time Basis for all digital

channels

a) Please clarify if the bank requires capability related

to developing models for scoring the digital

transactions based on supervised and un-supervised

techniques?

b) Please clarify if the bank envisages deploying model

in any existing fraud transaction monitoring tool ?

c) Please clarify if the bank intends to procure the real

time fraud prevention tool as part of this

evaluation/initiative?

d) Does the bank expect to include an intelligent

investigation tool to assess the results of new rules /

models

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

67 40 Annexure E- Point 2 Customers may have the same or a similar product

but might be very different in profitability and

marketing efforts can leverage this information to

sell a higher margin or higher value product to a

profitable customer. An upsell model evaluates this

insight at a customer level

Currently, is there a way of tracking customer

profitability which can aid in deriving insights for

upsell and cross-sell? If yes, how is this information

tracked?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

68 40 Annexure E- Point 3 Various campaigns are run to target existing

customers for cross-selling other products. Budgets

allocated for running such campaigns are limited.

Thus, models to improve response rate to these

cross-sell campaigns are significant. Higher product

penetration per customer also discourages

customer attrition, improves customer loyalty

Are there any specific channels that are being used

currently for cross-sell campaigns and do we have the

information pertaining to the responses from the

specific channels

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

69 40 Annexure E- Point 10 Identify internal & external factors affecting lead

conversion and forecast the factors for upcoming

quarters

Currently, does the bank have a mechanism to track

lead conversions and the data needed for identifying

the forecast factors

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

70 19 Disaster Recovery : 3 The NEXT-GEN DW – DR solution needs to be set

up at a remote location at Hyderabad.

Please clarify if a backup solution is required in DR.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

71 20 Data Archival & Backup : 2 Data Archival solution should not be visible to end

user, but Archived data should be available for all

end users. For end user it should be a single view

with Data Federation/Virtualization Layer

What directory service is used by SBI? Is it Active

Directory or another directory service?

Active Directory

72 20 Data Archival & Backup :5 Store backup of entire ecosystem on suitable cost-

effective, fast recovery infrastructure (Currently

tape backup is taken)

What is the current backup solution? Is the migration

of old backup in scope? If yes, what is the size and number

of tapes (LTO versions)?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

73 20 Data Archival & Backup :7 Archival and Backup setup must support automated

Data Reconciliation whenever movement from

Current/Live happens

Archival and Backup are different solutions

meeting different sets of requirements. They

complement each other, e.g. Archival will reduce the

Backup footprint. There won't be any automated

Data Reconciliation. We request the Bank to remove this

point.

No change in standard clause of EOI

74 20 Cloud Integration and Migration :1 NEXT-GEN DW should be able to consume data

from external cloud-based infrastructures.

Please suggest if there are any specific cloud

providers.

This information is not required at this stage of EOI.

75 6 Downstream Data Consumption -28 Dedicated high-performance department wise

sandboxes allocated to end users for R&D

A sandbox can be spun up or down based on the

need (instead of having a dedicated one); this will

ensure that storage and compute are not locked and are

optimally used. Request you to consider modification

of this requirement.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

76 13 Annex B Real time Data ingestion with spontaneous

reconciliation

Please explain 'spontaneous reconciliation' with an example.

Spontaneous means real-time reconciliation here.

77 14 Critical Functional Requirement -

Data Ingestion #8

Existing ETL Jobs to be Fine Tuned

Is the rationalization of ETL jobs also expected?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

78 15 16 Proposed solution should be able to scrape

encrypted log, capture Metadata changes at source

level completely, scraping 4000-5000 logs daily

having log size of ~ 2 TB each scalable up to 10000

logs. Proposed solution should be capable of

scraping logs generated by any type of Database.

E.g. Oracle Database, IBM DB2 Database etc.

Please reconfirm if the log size is 2TB, because that

would imply that the total size of daily logs is 10000 *

2TB = 20000TB

Please refer to Corrigendum.
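
The arithmetic behind the query above can be checked with a short sketch. It takes the clause's figures at face value and assumes decimal units (1 PB = 1000 TB); the function name is illustrative only and does not come from the EOI.

```python
def daily_log_volume_tb(logs_per_day: int, tb_per_log: float) -> float:
    """Total daily log volume in TB for a given log count and per-log size."""
    return logs_per_day * tb_per_log


# Figures quoted in the clause: 4000-5000 logs daily at ~2 TB each,
# scalable up to 10000 logs.
for logs in (4000, 5000, 10000):
    total_tb = daily_log_volume_tb(logs, 2.0)
    print(f"{logs} logs/day -> {total_tb:.0f} TB/day ({total_tb / 1000:.0f} PB/day)")
```

At the upper bound this gives the 20000 TB per day that the bidder flags, which is why the per-log size is being questioned.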

79 15 16 Proposed solution should be able to scrape

encrypted log, capture Metadata changes at source

level completely, scraping 4000-5000 logs daily

having log size of ~ 2 TB each scalable up to 10000

logs. Proposed solution should be capable of

scraping logs generated by any type of Database.

E.g. Oracle Database, IBM DB2 Database etc.

Our suggestion is that it would be appropriate for the

bank to specify the anticipated source types to be

supported for log scraping rather than 'any type of

database'

No change in standard clause of EOI

80 15 17 Solution should be able to handle DDL change

without manual reorg/runstat. It should handle

network fluctuations and hindrances.

Request you to please elaborate on this requirement

and specify which solution component or solution

capability this refers to, DDL changes to which

database, and the nature of hindrances being referred

to

DDL changes in source database.

Nature of hindrances includes all possible failures

81 15 Critical Functional Requirement -

Data Ingestion #8

Solution should be able to handle DDL change

without manual reorg/runstat. It should handle

network fluctuations and hindrances.

Is it a DDL change of the source system or a DDL change in the

Data Warehouse? Please provide examples of network

malfunctions and hindrances.

DDL changes in source database.

Nature of hindrances includes all possible failures

82 15 3 A multi-temperature data management solution to

be proposed by vendor where data that is

frequently accessed on fast storage—hot

data—compared to less-frequently accessed data

stored on slightly slower storage—warm data—and

rarely accessed data stored on the slowest storage

—cold data. System should also be capable

of automated storage tiering and seamless data

transfer between hot, warm and cold storage. Data

residing in any of these storage areas must be

seamlessly mixed / merged according to

requirements without impacting performance.

1. We believe the bank would want multi-temperature

data management to be within the storage system as

well as across the storage system. Please confirm.

2. Please confirm what will be the data movement

trigger criteria. E.g. Age of file and/or access frequency

of file etc.

3. Please confirm if the understanding of tiers vis a vis

storage technology is correct: -

Tier 1 - Hot Data - Flash Storage

Tier 2 - Warm Data - Spinning Disks (NL-SAS) based

storage

Tier 3 - Cold Data - Tape Storage

4. Please confirm what could be the likely

performance requirement in terms of latency to be

provisioned from storage system?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.
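
The trigger criteria raised in the query above (age of data and/or access frequency) could be expressed as a simple tiering rule. The sketch below is only an illustration: the Bank has not specified any thresholds, so every threshold value, the `DatasetStats` type and the `assign_tier` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DatasetStats:
    age_days: int            # days since the data was written
    reads_last_30_days: int  # access frequency over a recent window


def assign_tier(
    stats: DatasetStats,
    hot_max_age_days: int = 90,     # hypothetical threshold
    warm_max_age_days: int = 730,   # hypothetical threshold
    hot_min_reads: int = 100,       # hypothetical threshold
    warm_min_reads: int = 1,        # hypothetical threshold
) -> str:
    """Return 'hot', 'warm' or 'cold' based on age and/or access frequency."""
    if stats.age_days <= hot_max_age_days or stats.reads_last_30_days >= hot_min_reads:
        return "hot"
    if stats.age_days <= warm_max_age_days or stats.reads_last_30_days >= warm_min_reads:
        return "warm"
    return "cold"


print(assign_tier(DatasetStats(age_days=10, reads_last_30_days=0)))
print(assign_tier(DatasetStats(age_days=1000, reads_last_30_days=3)))
print(assign_tier(DatasetStats(age_days=1000, reads_last_30_days=0)))
```

Using "or" between the two criteria keeps old but heavily queried data on faster storage; a bidder could equally propose an age-only or frequency-only policy.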

83 16 4 Storage replication (e.g. RAID) should be

automatically managed by the platform.

We believe storage replication (RAID) technology is

desired for protecting data within the system from

disk failures. Also, please confirm if the RAID feature

should be highly resilient with minimum dual disk

protection.

Please confirm if there is a need to replicate data

using storage-based replication methodology

to another site (e.g. DR site), or whether application-level

replication is also acceptable.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

84 16 5 Tool should have capability to store/swap data in

memory, disk and distributed storage areas

depending on the age of the data determined by its

usages through user queries

Please confirm if distributed storage area requires a

distributed file system based scale out storage which

can grow in performance and capacity linearly as per

the system demands and still provides a single global

namespace to DWH application.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

85 16 6 User should be able to work on DB even while

backup is in progress. They should be able to run

statistics and reorganize their tables. Any

background process including backup must not

hamper performance of user queries.

Please share the expected backup size and backup

window in the expected system

Bidders to propose Backup solution(s) in view of the sizing given in Annexure

G

86 16 8 Storage should support data compression. It should

be possible to perform both fast compression and

efficient compression based on data processing

needs.

Compression is a CPU-intensive job. Most large

scale storage systems are designed to provide a single

compression algorithm which is intelligent enough to

identify which data can be compressed efficiently and

fast.

Data which cannot be compressed (e.g. images or

videos) or which achieves a lower compression ratio is

skipped so as to preserve CPU cycles and thereby

the performance of the overall system.

It is requested to drop this clause from the storage

specifications

No change in standard clause of EOI

87 16 10 Ensuring real time health checks, monitoring and

alerting about data storage / utilization of storage /

failure handling of storage components. Actionable

dashboard must be available to designated users to

monitor health checks and tool should

automatically issue alerts to users.

Some errors in storage systems go unnoticed, without

being detected by the disk firmware or the host

operating system; these errors are known as silent

data corruption.

Please confirm if the storage should also have the

feature to protect from silent data corruption and end

to end file integrity features

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

88 16 11 The storage system should be robust to handle at

least 1,50,000 concurrent queries (Select/DML) by

processing engines / ETL jobs / end users scalable

up to 6,00,000 concurrent queries in next 5 years

(assuming parallelism of 100 degree).

Storage system performance is governed by

IOPS/MBps and latency. We request the Bank to help in

understanding how many IOPS per transaction are

expected and at what block size.

Bidder to provide details of performance benchmarking to enable us to take a

holistic and comprehensive view of the architecture in formulating next

course of action
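
The relationship the bidder refers to, between IOPS, block size and throughput, can be sketched as below. The concurrent-query count comes from the clause; the IOPS-per-query and block-size values are purely hypothetical placeholders, since the Bank has not supplied them, and the function names are illustrative.

```python
def required_iops(concurrent_queries: int, iops_per_query: float) -> float:
    """Aggregate IOPS if every concurrent query issues iops_per_query I/Os per second."""
    return concurrent_queries * iops_per_query


def throughput_mib_per_s(iops: float, block_size_kib: float) -> float:
    """Throughput in MiB/s implied by an IOPS figure at a given block size."""
    return iops * block_size_kib / 1024


# 1,50,000 concurrent queries is from the clause; 2 IOPS per query and
# an 8 KiB block size are assumptions made only for illustration.
iops = required_iops(150_000, 2)
print(f"{iops:.0f} IOPS -> {throughput_mib_per_s(iops, 8):.2f} MiB/s at 8 KiB blocks")
```

The same IOPS figure implies very different throughput at different block sizes, which is why the query asks for both values.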

89 16 12 Downstream departments (data consumer) to be

given separate processing power, storage to

undertake their requirements with separate DB

snapshot, Audit trails should be available for any

user accessing the Databases. Construction of this

separate Database snapshot and enabling this audit

trails must not cause any major systemic

issues/challenges in smooth functioning of primary

DB.

Please confirm if the downstream departments need

to have dedicated storage environment or can use the

same central storage using a dedicated volume from

the storage

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

90 19 Critical Functional

Requirements:Disaster Recovery

The DR solution should be synced with production

NEXT-GEN DW. The SLA for RTO should be

maximum 2Hrs as per Bank’s defined policy.

What is the expected RPO?

Bidders to provide their best RTO/RPO for solution(s) proposed.

91 20 1 Data older than specific duration as identified by

Bank to be archived in low cost cold storage.

Changing data archival rules should be easily

configurable. Vendor to propose solution for the

same with cheap and flexible storage and

processing

The cheapest storage available is tape. We assume

that the archival system needs to ensure that the data

archived to low cost storage is still part of the same

filesystem. Please confirm.

Also, kindly mention the expected frequency of data

archival

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.
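
The clause above asks for archival rules that are easy to reconfigure. One way to picture this is a rule table keyed by dataset, as in the sketch below. The dataset names, retention periods and function name are all hypothetical; the EOI does not define any of them.

```python
from datetime import date, timedelta

# Hypothetical, easily editable archival rules: retention in days per dataset.
ARCHIVAL_RULES = {
    "transactions": 365 * 7,
    "web_logs": 180,
}
DEFAULT_RETENTION_DAYS = 365 * 3  # assumed fallback for datasets without a rule


def is_due_for_archival(dataset: str, record_date: date, today: date,
                        rules: dict = ARCHIVAL_RULES) -> bool:
    """True when the record is older than the retention period for its dataset."""
    retention = rules.get(dataset, DEFAULT_RETENTION_DAYS)
    return (today - record_date) > timedelta(days=retention)


print(is_due_for_archival("web_logs", date(2020, 1, 1), date(2020, 12, 31)))
print(is_due_for_archival("web_logs", date(2020, 12, 1), date(2020, 12, 31)))
```

Changing a retention period then means editing one entry in the rule table rather than the archival job itself.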

92 20 5 Store backup of entire ecosystem on suitable cost-

effective, fast recovery infrastructure (Currently

tape backup is taken)

As the production data will be in disk or flash based

system, we understand that the tape based backup is

still acceptable to Bank as long as the tape library

offers the high availability using dual robotics and

provides scheduled automatic media verification.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

93 20 6 Mixing and Merging data from Current/Live to and

from Archival must not result in any significant loss

of performance and response time

Any change required on archived data requires the

data to be moved back to the faster tier from lower

tier first. This shouldn't result in significant

performance or response time loss. However, we

would suggest a minimum throughput to be asked

from tape archival system for both archival and

retrieval.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

94 20 2 Cloud integration / data transfer to and from

public/private/hybrid cloud should be available

using all standard protocols. (Web requests /

secure transfer channels etc)

Please confirm if there is a need to archive data to a

public cloud storage from the production storage.

At present, as per the Bank's IS policy migrating/storing data in public cloud is

not permitted. However bidders may propose, as an alternative, use of cloud

(public cloud, private cloud, on-premise etc) in addition to the best integrated

proposed solution.

95 21 2 Provide trust – The system should be able to

ensure the users that they are accessing data from

the right source of information. 

Please elaborate. The authenticated access

mechanisms and user role group matrix ensure this. Is

there any other trust level that needs to be generated?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

96 21 3 Provide auditability – the solution should record

any access to the data to satisfy compliance audits.

For example, it should be able to check on who

touched the data, when did they touch it, is there a

chain of custody issue, is there transparency in

terms of data privacy and protection etc. 

What is the duration of the user activity logs / queries

that would need to be stored ?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

97 21 4 Enforce security and privacy – Data inside the NEXT-

GEN DW will be accessed by only authorized users.

Data at rest/in-motion should be encrypted 

What is the level of encryption to be achieved? (256,

2048)

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

98 21 5 Capability to classify and store (personal

identifiable information) sensitive data in

encrypted /masked form and should have capability

to decrypt/unmask such information in NEXT-GEN

DW when required by only authorized ID’s.

Will the bank provide the classification tables, or a

workshop has to be conducted to arrive at the PII

data. Level of encryption is also required (256 or 2048,

etc)

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.
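
The masked view of sensitive fields described in the clause above can be illustrated with a small sketch. It covers only masking at read time for unauthorized users and does not address encryption at rest or key management; the role name `pii_reader` and both function names are hypothetical.

```python
def mask_value(value: str, visible: int = 4, mask_char: str = "*") -> str:
    """Mask all but the last `visible` characters; fully mask short values."""
    if visible <= 0 or len(value) <= visible:
        return mask_char * len(value)
    return mask_char * (len(value) - visible) + value[-visible:]


def read_field(value: str, user_roles: set,
               authorized_roles: frozenset = frozenset({"pii_reader"})) -> str:
    """Return the clear value only to users holding an authorized role."""
    if authorized_roles & set(user_roles):
        return value
    return mask_value(value)


print(read_field("1234567890", {"analyst"}))
print(read_field("1234567890", {"pii_reader"}))
```

Which fields count as PII, and which IDs are authorized, would come from the classification exercise the bidder asks about.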

99 22 14 Parallel processing: Data governance tool should be

able to handle 500 concurrent users, scalable up to

1000 users in next 5 years, running any kind of job

(eg: Data Lineage on simple/medium/complex jobs

running on multiple tables)

We understand that these concurrent jobs are by way of

queries relating to information from the catalog and

related to data lineage. Please confirm.

Yes

100 22 14 Parallel processing: Data governance tool should be

able to handle 500 concurrent users, scalable up to

1000 users in next 5 years, running any kind of job

(eg: Data Lineage on simple/medium/complex jobs

running on multiple tables)

Please clarify if the definition of concurrent users

means concurrently logged into the platform or

concurrently executing a job/ query. (The latter is a

fraction of the former)

Concurrently executing a job/ query.

101 22 13 Masterdata Management Capability: Master Data

Management tool (s) should deliver consolidated,

complete and accurate view of business-critical

master information to all the operational and

analytical systems across the Bank. 

Please specify whether bank has implemented Master

data management solution earlier across SBI. If yes

please provide the details of the same

At present, DWH doesn’t have MDM solution

102 23 8  Security features: - to help in protecting the

information contained in the data dictionary. 

Kindly elaborate the security features

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

103 24 Security and Compliance - 1 Authentication and Identity Management - A

comprehensive identity and access management

system should be available for centralized

management of users and groups. It should be

possible to quickly create and revoke the identity of

a user or a service by simply deleting or disabling

the account in the directory. Multi-factor

authentication is desired as an additional layer of

security for user sign-in and transactions.

There are requirements on Identity and Access

Management as well as User access management

administration.

Do we have to propose and provide an Identity and

Access Management (IAM) solution?

Please clarify whether we should integrate with an

existing IAM or whether an IAM solution needs to be

implemented for NextGen DW.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

104 24 1  Authentication and Identity Management - A

comprehensive identity and access management

system should be available for centralized

management of users and groups. It should be

possible to quickly create and revoke the identity of

a user or a service by simply deleting or disabling

the account in the directory. Multi-factor

authentication is desired as an additional layer of

security for user sign-in and transactions. 

Would the SI be required to use the Bank's

Authentication and Identity Management Systems, or

integrate with the Bank's PIM etc.?

Please provide any suitable details regarding the IAM

modules of the Bank.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

105 25 Security and Compliance - 6 Data Leakage - Security CIA parameters should be

achieved, and tools should be able to find and alert

on Data leakage

Are there existing Data Leakage (Loss) prevention

tools which can be leveraged?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

106 25 Security and Compliance - 8 Compliance to Global Standards

Among the standards provided, can we get the list of

standards we can consider while scoping?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

107 25 8 Compliance to Global Standards , GDPR, BCBS239,

PCIDSS, DFRA and similar relevant standards 

Please specify whether the Bank has implemented any

GDPR, BCBS239, PCIDSS, DFRA solutions in part. Please

provide further details on these requirements as this is

a very broad and comprehensive area of solutioning.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

108 25 5  Data Democratization - Secure access of PROD

Database to LHOs/GOCs 

Please elaborate on the levels of access to the LHO

and GOCs.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

109 25 6  Data Leakage - Security CIA parameters should be

achieved, and tools should be able to find and alert

on Data leakage 

Does the Bank have any DLP in place or is the SI

expected to implement a DLP in parallel.

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

110 25 8  Compliance to Global Standards 

GDPR, BCBS239, PCIDSS, DFRA and similar relevant

standards 

Office of Foreign Assets Control (OFAC) 

Financial Crimes Enforcement Network (FinCEN) 

Securities and Exchange Commission (SEC) 

Office of the Comptroller of the Currency (OCC)

etc. 

Request to filter out the standards relevant to the

Local Indian Standards. Need to classify the

International Branch locations and then apply the

respective regulations local to that nation. Does the

Bank expect another classification in the Data Lake

based on the Country of Origin and respective

Regulation.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

111 26 1  Logging of operational activities: Support the

logging of all user activities without slowing down

the performance. 

Log retention periods are required. The storage

requirements are to be calculated based on the estimated

log volume. Similarly, the required retrieval speed of the

logs/audit trail will determine the physical attributes of the

media (flash etc.).

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

112 26 2  Should have metadata enabled reporting

mechanism on run time log. 

Does the Bank expect a separate Analytical Model to

be built for User Logs for audit trail.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

113 26 4  Extracting useful information from system logs to

understand the efficiency of system and any fraud 

Extraction time and query parameters need to be

arrived at. Does the bank have any such parameters

for Fraud which they can pass on to the SI so that the

query can be built

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

114 26 6  Vendor to propose solution with cheap storage

options for log storage of all the Bank’s applications

and mechanisms to extract requested information

from the logs as and when required 

Storage cost would depend on the speed of the query

results. Bank to provide the parameters.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

115 28 Data Science Platform with AI/ML

Capabilities (Sl No.2)

Power data / objects to existing analytics models

built on proprietary tools (IBM SPSS). Migration of

such models to new solution

How many such models in IBM SPSS are we expected

to migrate? Also do you continue to use the same

model or rebuild them on new platform if required?

This information is not required at this stage of EOI.

116 28 Data Science Platform with AI/ML

Capabilities (Sl No.1)

Implementing end to end analytics use-cases as

mandated by the Bank

How are these analytics use cases

deployed/consumed/used (especially in real time)? Through

online platforms/mobile apps/CBS/YONO, etc.?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

117 28 Data Science Platform with AI/ML

Capabilities (Sl No.3)

Availability Pre-build models which can be directly

used with Bank’s data to get insights

What kind of pre-built models are

expected?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

118 29 Data Science Platform with AI/ML

Capabilities (Sl No.15)

Build and Publish detailed reports, insights on web

portal

What kind of reports are we talking about, model

performance or output based reporting

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

119 29 Data Science Platform with AI/ML

Capabilities (Sl No.19)

Integration with R, Python, Keras,

Tensorflow,Theano, scikit-learn etc and other

frameworks / languages

Do we have any models deployed on open-source

software such as R/Python, etc.?

If yes, how many such models need to be

migrated? Please also give counts by software used (e.g., how

many in R, Python, etc.).

If no, does the Bank have suitable policies around

open-source analytics software usage? Which

open-source software packages are approved?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

120 29 Data Science Platform with AI/ML

Capabilities (Sl No.21)

Support on demand basis Please clarify what "Support on demand basis" implies.

Would this be AMS alone, or would support also

include training?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

121 29 Data Science Platform with AI/ML

Capabilities (Sl No. 13)

Access management at the data, workflow and

models

Please clarify on how many users are expected to use

the platform?

Please refer Hardware Specification subsection, point number #34 on page

#35

122 30 6 Reporting on all types of available of Data Formats; Please share additional details on the nature of

reporting required on unstructured data, click stream

data, images, videos, audios etc.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

123 32 Critical Functional Requirements:

Hardware Specifications

The proposed solution envisages use of commodity

hardware, and if any proprietary components are

used, should be listed in the response with details

and justification.

Need clarity on whether SBI is looking for any specific platform here,

like Power or Intel, or whether it is up to the Bidder to decide.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

124 32 3 The proposed solution envisages use of commodity

hardware, and if any proprietary components are

used, should be listed in the response with details

and justification

While this point envisages the use of commodity

hardware, the next point i.e. point 4 states that

vendor proposed hardware is expected to be

enterprise class and best of breed. Kindly provide

more clarity on this ask. Typically, we have seen banks

go for commodity hardware for the data lake, but best

of breed, enterprise class, non-commodity hardware

for the Data Warehouse

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

125 33 Critical Functional Requirements:

Hardware Specifications

The Vendor is required to supply, install, test,

commission, monitor, manage and maintain the IT

System along with operating system and other

peripherals with one-year warranty and AMC for 4

years from the date of delivery at data centers

advised by the Bank

Need clarity on whether BAU support (day-2 operations support)

needs to be considered. Is the existing network

infrastructure to be used, with the bidder providing only TOR

switches for NEXT-GEN DW systems?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

126 33 4 Vendor proposed hardware is expected to be

enterprise class, best of breed, tested and stable

release

Enterprise class hardware is not necessarily built using

pure play commodity components. For better

reliability of the hardware, it is requested to ask for fit

for purpose hardware which will help bank achieve the

desired business objectives.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

127 33 5 The proposed architecture considers vertical and

horizontal scalability as one of the most important

design principles.

Not all hardware required to run the solution will give

both horizontal and vertical scalability. We request

you to please enforce the clause on a wherever

technically possible basis.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

128 33 9 The hardware will be delivered in a staggered

manner and Vendor to provide a plan for the same

While it's possible to support the hardware for a

period of 7 years from the date of supply, it might not

always be possible for the same hardware to be

available for sale for a duration of 7 years for

staggered deliveries. Hence it is requested to ask for

backward and forward compatibility of the hardware

such as storage subsystem where both vertical as well

as horizontal scalability requirement is a must.

No change in standard clause of EOI

129 34 18 The proposed hardware is mission critical for the

proposed project and support of 24 X 7 with an

uptime of 99.99 % to be ensured by providing

support at PR, and DR site for a period of 5 years.

It is requested to ask for 6 Hours "Call to Resolution"

time in event of hardware failure for better uptime.

Also, please ask vendors how would they plan to

achieve this requirement

No change in standard clause of EOI
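
To put the figures in this query side by side, the 99.99 % uptime target can be converted into a yearly downtime budget. The sketch below is plain arithmetic, not part of the EOI; the function name and the non-leap-year assumption are illustrative.

```python
# Illustrative arithmetic only: converts an availability target into a downtime budget.
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year, 8760 hours


def downtime_budget_minutes(availability_pct: float, hours: float = HOURS_PER_YEAR) -> float:
    """Return the minutes of downtime allowed over `hours` at the given availability."""
    return hours * 60 * (1 - availability_pct / 100)


budget = downtime_budget_minutes(99.99)
print(f"{budget:.2f} minutes per year")  # about 52.56

# A single outage lasting the full 6-hour resolution window would exceed the yearly budget.
six_hours = 6 * 60
print(six_hours > budget)
```

Whether a hardware failure actually counts as downtime depends on the redundancy in the proposed design; the comparison only shows that a 6-hour resolution time alone cannot deliver 99.99 % if the failed component takes the service down.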

130 35 Critical Functional Requirements:

Hardware Specifications

The Hardware solution must be compatible to

integrate with various systems in the Bank

including but not limited to SOC, PIMS, NOC,

Command Centre, ITAM, Service Desk, ADS, and

SSO etc. at no extra cost. Vendor will have to give

appropriate support to the Bank during integration

with various components of IT environment.

Need clarity on whether SBI will provide the service desk/ticketing

tool and other tools that require licences.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

131 36 35 The web portal of Business Intelligence tool should

support at-least 25000 concurrent users, scalable

up to 75000 in next 5 years, accessing various

reports generated

Please share additional details on this user base such

as user categories, size of this user population in terms

of number of users and their usage profile. Please also

share additional information on how the estimate of

25000/75000 concurrent has been arrived at. This will

help us understand and size for the requirement.

The usage profile could cover information such as the

number of report/dashboard refreshes or queries each

such user is expected to fire during a working day or if

in a particular part of the working day, there is any

window when higher concurrency is expected for

some specific information, etc. (such as all branch

managers refreshing a dashboard between 9-10 am on

Mondays)

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

132 36 35 The web portal of Business Intelligence tool should

support at-least 25000 concurrent users, scalable

up to 75000 in next 5 years, accessing various

reports generated

Please clarify if the definition of concurrent means

concurrently logged into the BI platform or

concurrently executing a report/dashboard refresh or

ad hoc query. (The latter is a fraction of the former)

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.
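
The distinction raised in this query matters for sizing, because the number of users executing a query at a given instant is usually a fraction of those logged in. A minimal sketch of that arithmetic follows; the 10 % active fraction is a purely hypothetical placeholder, not a figure from the EOI or the Bank.

```python
# Hypothetical sizing sketch: logged-in users versus users executing a query right now.
def executing_users(logged_in: int, active_fraction: float) -> int:
    """Estimate concurrently executing users from concurrently logged-in users."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    return round(logged_in * active_fraction)


ASSUMED_ACTIVE_FRACTION = 0.10  # assumption for illustration only

for logged_in in (25_000, 75_000):  # the two concurrency figures quoted in the clause
    print(logged_in, executing_users(logged_in, ASSUMED_ACTIVE_FRACTION))
```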

133 29, 40-43 Data Science Platform with AI/ML

Capabilities (Sl No.24)

Analytics on real-time data in real-time/near real-

time

Which of the Appendix F: Usecases are currently or

expected to be deployed realtime/near realtime?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

134 11 Eligibility Criteria Vendor should have existing Next-Gen Data

Warehouse solutions as mentioned in EOI

Please advise whether a vendor will qualify only if they have

delivered all solutions for a single client per Annexure B,

or whether solutions delivered piecemeal across multiple clients

would also be considered.

Please refer to Corrigendum.

135 11 Eligibility Criteria Solutions should have been implemented in at least

2 large scale organisation

Please define the size of a large-scale organisation. Should it

be an Indian organisation, or would an entity outside India

also be acceptable?

Please refer to Corrigendum.

136 General query Does the current scope of work cover only domestic

operations, or international operations as well?

Current scope of work covers both Domestic and International operations of

the State Bank of India (SBI) Group including subsidiaries.

137 3 Schedule of event last date of submission - 22 April 2019 We request a 3-week extension of the submission

timeline.

No change in timelines of EOI

138 General query Can you provide an IT Org chart ? This information is not required at this stage of EOI.

139 General query Is there a current IT strategy and if yes, can you please

share with us?

This information is not required at this stage of EOI.

140 General query Is there a current IT roadmap/plan and if yes, can you

please share with us?

IT Roadmap of DWH department is already covered in EOI. Hence separate IT

Roadmap is not required.

141 General query Can you share any documentation around High-level

system architecture depicting the current IT

landscape? 32 use cases have been specified in

annexure E. Would it be possible to prioritize these

use cases to allow for an efficient and orderly

evolution of the Next Gen DW?

This information is not required at this stage of EOI.

142 General query Can you please specifically indicate any in-

progress/planned projects which may have conflict

with this effort from functionality standpoint? What is

the engagement timeline you envisage based on your

internal IT roadmap strategy, taking into

considerations any big upgrade or software

development conflict you see in near term future

This information is not required at this stage of EOI.

143 General query Can you provide the existing IT security policies? Annexure F gives the Summary of SBI Internal IS Policies.

Detailed information is not required at this stage of EOI.

144 General query Can you please provide a list of all the IT systems,

providing brief description of them?

This information is not required at this stage of EOI.

145 General query Is your landscape currently leveraging any cloud

services, if so on what infrastructure ?

This information is not required at this stage of EOI.

146 General query From the current environment , are there areas that

are working well and proposed to be retained for the

next gen DW with minimal changes?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

147 26-27 Data Encryption & Masking General query Can you please share encryption standards used for

authorizing users

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

148 General query Can you kindly share a list of downstream

applications if the jobs you have mentioned include

the downstream data processing jobs as well

This information is not required at this stage of EOI.

149 General query The EOI mentions that existing infrastructure

components/ETL jobs/schedulers etc. will need to be

reused. Can the bank provide additional details on the

existing architecture/system components (e.g.,

different platforms, programming languages used,

etc.)?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

150 13-15 Data Ingestion General query Can you kindly share the current system architecture,

including the list and number of sources such as ERPs,

stand-alone bespoke applications, web applications

and file systems, etc.?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

151 18-19 Migration from Existing Setup to

Proposed Solution

General query Can you kindly share the list of software and technology

stacks being used across the IT landscape, specifically

around ETL and BI reporting?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

152 18-19 Migration from Existing Setup to

Proposed Solution

General query Please advise any ETL tool that you are currently using

for data integration.

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

153 18-19 Migration from Existing Setup to

Proposed Solution

General query Can you advise any reporting platform or tools

currently used for operational, management or

financial reporting?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

154 General query From the current environment , are there areas that

are working well and proposed to be retained for the

next gen DW with minimal changes?

Duplicate Query. Refer to Sr.No. #146

155 General query How many existing applications will need to be

migrated to the new infrastructure? Does the bank

have any preference on the migration approach (e.g.,

big bang vs staggered)

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

156 13-15 Data Ingestion General query Performance benchmarking for the Next Gen DW – are

there any bank specified/minimum standards (e.g.,

latency/throughput requirements)?

Bidder to provide details of performance benchmarking to enable us to take a

holistic and comprehensive view of the architecture in formulating next

course of action

157 23-24 Data Quality General query The EOI includes data quality as one of the key

requirements. Would the same data quality standards

need to be applied across historical vs operational

data (in most banks, historical data standards are

typically lower). Also, given the magnitude of the data

cleansing effort, would certain use cases/data domains

need to be prioritized (e.g., data required for RBI

reporting)?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

158 13-15 Data Ingestion General query What is your current performance benchmark for

throughput while sourcing data ?

This information is not required at this stage of EOI.

159 40 Functional Use Cases General query 32 use cases have been specified in annexure E. Would

it be possible to prioritize these use cases to allow for

an efficient and orderly evolution of the Next Gen

DW?

This information is not required at this stage of EOI.

160 13-15 Data Ingestion General query What is current process for fixing rejections, data

quality issues and data anomalies ?

This information is not required at this stage of EOI.

161 19-20 Disaster Recovery/Data Archival and

Backup

General query Would you kindly share the current SLA for RTO ? This information is not required at this stage of EOI.

162 19-20 Disaster Recovery/Data Archival and

Backup

General query Can you kindly share any current version control tool

s/w exclusively being used across systems and the

protocol followed for RTO ?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

163 19-20 Disaster Recovery/Data Archival and

Backup

General query Can you kindly share data retention period for the

mentioned systems per Annexure G

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

164 23-24 Data Quality General query Can you share the existing data quality tool and

specific features used on daily basis ?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

165 23-24 Data Quality General query What is the current process of fixing data quality

issues ?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

166 23-24 Data Quality General query Can you please provide details around the profiling being

used, i.e., is profiling done across all data or

master/dimensional data only? How many entities

is profiling based on as of today?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

167 23-24 Data Quality General query What is the SLA for data correction and the process as

it exists in today's world ?

This information is not required at this stage of EOI.

168 24 Data Reconciliation General query Is data reconciliation to be managed at report level or

at database level? And consequently, would

a workflow process be required?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

169 26-27 Data Encryption & Masking General query Are there any BFSI rule-set readily available which the

client is currently aligned to in context of PII ?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

170 26-27 Data Encryption & Masking General query Are there any such rule-sets expecting updates which

will impact the effort in consideration?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

171 30 Business Intelligence Tools General query Can you provide the list of current KPIs being

monitored? Can you provide a break-up of

dashboards/reports by functional area, e.g. regulatory

reporting, customer acquisition, customer retention, etc.?

Please refer to requirements given in the EOI.

172 30 Business Intelligence Tools General query Can you please share the total no# of reports, No# of

dashboard, no# of KPI's tracked and reported as part

of regulatory reporting ?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

173 30 Business Intelligence Tools General query What are the other reporting solution needs for the

client and the expected functionality thereof?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

174 30 Business Intelligence Tools General query What is the scope/tenacity of analytics planned to be

used based on use cases shared ?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

175 General query Can you share any or all software licenses currently

used for the existing Data ware house solution

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

176 General query Can you share any or all software licenses currently

used from upstream and downstream standpoint and

your preference thereof to related product suite

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

177 General query How open are you to provide VPN connectivity for

people logging from outside your office working on

development effort?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

178 18-19 Migration from Existing Setup to

Proposed Solution

General query Can you specify how many

master/transactions/dimension/facts/aggregate you

have in your ODS/DWH

Please refer to Annexure D.

Other details are not required at this stage.

179 30 Business Intelligence Tools General query How many dashboards do you currently use and total

number of reports related to dashboards?

This information is not required at this stage of EOI.

180 17 Data Processing Framework General query What is the key ask for AI/ML technologies other than

performance? Can you please elaborate upon them?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

181 23-24 Data Quality General query What are the key issues faced as part of data

governance which you would like to address on

priority?

Please refer to Data Governance section of EOI on page #21

182 Pg 15 3 A multi-temperature data management solution to

be proposed by vendor where data that is

frequently accessed on fast storage—hot

data—compared to less-frequently accessed data

stored on slightly slower storage—warm data—and

rarely accessed data stored on the slowest storage

—cold data. System should also be capable of

automated storage tiering and seamless data

transfer between hot, warm and cold storage. Data

residing in any of these storage areas must be

seamlessly mixed / merged according to

requirements without impacting performance.

1. What are the criteria on which the movement of

data between tiers is to be defined?

2. The duration for which data resides in each tier (hot/warm

and cold) should be defined.

3. What is the amount of daily (new) data

coming into storage?

4. The expected response time from the storage system

should be defined.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.
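
One way a bidder might express the tiering criteria asked about in this query is a recency-based rule. The sketch below is a minimal illustration under stated assumptions: the 30-day and 365-day thresholds and the function name are hypothetical, since the Bank has not defined movement criteria or residency durations at this stage.

```python
from datetime import date


def classify_tier(last_access: date, today: date,
                  hot_days: int = 30, warm_days: int = 365) -> str:
    """Assign a storage tier from the age of the last access.

    hot_days and warm_days are assumed thresholds, not values from the EOI.
    """
    age_days = (today - last_access).days
    if age_days <= hot_days:
        return "hot"   # frequently accessed: fast storage
    if age_days <= warm_days:
        return "warm"  # less frequently accessed: slower storage
    return "cold"      # rarely accessed: slowest storage


today = date(2019, 4, 22)
for last_access in (date(2019, 4, 1), date(2018, 10, 1), date(2015, 1, 1)):
    print(last_access, classify_tier(last_access, today))
```

A production policy would more likely combine access frequency, data age and regulatory retention rules; recency alone is used here only to keep the example short.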

183 PG 16 4 Storage replication (e.g. RAID) should be

automatically managed by the platform.

Synchronous or asynchronous replication should be

defined and the reference to RAID technology should be removed.

No change in standard clause of EOI

184 Pg 16 5 Tool should have capability to store/swap data in

memory, disk and distributed storage areas

depending on the age of the data determined by its

usages through user queries.

Memory should be removed. Swapping can happen at

storage/ Disk level.

No change in standard clause of EOI

185 Pg 16 11 The storage system should be robust to handle at

least 1,50,000 concurrent queries (Select/DML) by

processing engines / ETL jobs / end users scalable

up to 6,00,000 concurrent queries in next 5 years

(assuming parallelism of 100 degree).

Storage performance metrics (IOPS/throughput and

response time) should be defined.

Bidder to provide details of performance benchmarking to enable us to take a

holistic and comprehensive view of the architecture in formulating next

course of action

186 Pg 19 4 The DR solution should be synced with production

NEXT-GEN DW. The SLA for RTO should be

maximum 2Hrs as per Bank’s defined policy.

SLA for RPO needs to be defined along with RTO Bidders to provide their best RTO/RPO for solution (s) proposed.

187 General query Is Deloitte expected to procure all hardware, tools,

links etc.?

This information is not required at this stage of EOI.

188 General query Will the solution be deployed at SBI DC and DR? Yes, the solution will be deployed at SBI DC and DR.

189 13 2 Data Ingestion

Data may be structured, semi-structured, and

unstructured. It may come from internal or external

sources. It may come in batches, incremental

additions or real-time feeds. There should be no

limitation on the type, format and size of data

ingested. Data may include log, feeds, audio, video,

image, NOSQL, RDBMS, unstructured text, through

ERP systems, etc

Is the NOSQL data mentioned here JSON/XML ? What

kind of processing on NOSQL is expected ? Is there a

need to join the NOSQL data with other relational data

? Is there a need to shred the NOSQL data into

relational data ?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.
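
To make the term "shredding" in this query concrete, the sketch below flattens one nested JSON document into a parent row and child rows that could be joined with relational data. The document shape and all field names are hypothetical; the EOI does not specify the NOSQL formats involved.

```python
import json

# Hypothetical sample document; not taken from any Bank system.
raw = """
{"customer_id": "C1", "name": "Sample Customer",
 "accounts": [{"acct_no": "1001", "type": "SB"},
              {"acct_no": "1002", "type": "CA"}]}
"""


def shred_customer(doc: dict) -> tuple[dict, list[dict]]:
    """Split a nested customer document into one parent row and N child rows."""
    customer_row = {"customer_id": doc["customer_id"], "name": doc["name"]}
    account_rows = [
        # The parent key is repeated on each child row so the two tables can be joined.
        {"customer_id": doc["customer_id"], "acct_no": a["acct_no"], "type": a["type"]}
        for a in doc.get("accounts", [])
    ]
    return customer_row, account_rows


customer_row, account_rows = shred_customer(json.loads(raw))
print(customer_row)
for row in account_rows:
    print(row)
```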

190 14 11 Data Ingestion

One of the most important feature is the richness

of the transformations to do day-to-day tasks, such

as;

Data conversion, lookup, expression, joining

records, splitting data, filtering, ranking, sorting,

grouping, looping, and combining data,

pivot/unpivot, converting dates, setting variables

based on parameter files, merging rows, finding the

latest file, and splitting data based on certain

conditions, running web methods, transforming

XML documents, rebuilding indexes, sending

emails, profiling data, handling arrays and records,

processing unstructured data, masking, monitoring

the inbound data flow for completeness,

consistency and accuracy, wizards to assist creating

complex packages, like loading fact tables, or type

two slowly changing dimensions (SCD – T2)

Are any tools / licenses already available with SBI for

ingesting data used in the existing DW solution?

Please provide a list. Any existing tools that align with

the new solution can be considered for reuse if found

to be a good fit.

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.
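
The clause above names type two slowly changing dimensions (SCD – T2) among the expected transformations. As a minimal in-memory sketch of the technique (not any particular ETL tool's wizard), a changed attribute closes the current row and appends a new current row. The column names, the high end date and the sample branch data are assumptions for illustration.

```python
from datetime import date

HIGH_DATE = date(9999, 12, 31)  # assumed "open-ended" marker for current rows


def apply_scd2(dimension: list[dict], incoming: dict[str, dict], load_date: date) -> list[dict]:
    """Apply a type 2 update: expire changed rows and append new versions."""
    current = {row["key"]: row for row in dimension if row["is_current"]}
    for key, attrs in incoming.items():
        row = current.get(key)
        if row is not None and row["attrs"] == attrs:
            continue  # unchanged: keep the existing current row
        if row is not None:
            row["valid_to"] = load_date  # close the old version
            row["is_current"] = False
        dimension.append({"key": key, "attrs": attrs, "valid_from": load_date,
                          "valid_to": HIGH_DATE, "is_current": True})
    return dimension


dim = [{"key": "BR001", "attrs": {"city": "Mumbai"}, "valid_from": date(2019, 1, 1),
        "valid_to": HIGH_DATE, "is_current": True}]
apply_scd2(dim, {"BR001": {"city": "Pune"}, "BR002": {"city": "Delhi"}}, date(2019, 4, 1))
for row in dim:
    print(row["key"], row["attrs"], row["valid_from"], row["valid_to"], row["is_current"])
```

History is preserved because the Mumbai row is kept with a closed validity interval rather than being overwritten.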

191 15 1 Data Storage

Vendor should propose effective number of data

storage layers in NEXT-GEN DW between data

ingestion and data consumption.

Are surrogate keys used in the existing DW?

If yes, is there a framework for storage and

management of the keys to ensure robustness of the

data warehouse?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

192 16 11 Data Storage

The storage system should be robust to handle at

least 1,50,000 concurrent queries (Select/DML) by

processing engines / ETL jobs / end users scalable

up to 6,00,000 concurrent queries in next 5 years

(assuming parallelism of 100 degree).

Will all these queries be executing in-flight at the

same time ? Or will these be initiated over a period of

time and the aggregate number of queries run during

that time period will be 600,000 ?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

193 16 13 It should be possible to project and view data

through multiple modes using the Storage on NEXT-

GEN DW. Varieties of GUIs should be available to

project or view the output generated through

analytic processes. For instance: The Bank may

decide to implement use-cases that project

transactions data as a graph data structure. The

Storage solution on NEXT-GEN DW should allow for

such projections.

Do we need to enable graph based analytics as well or

it should be only limited to the facility to access the

data for graph based analytics.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.
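
The clause's example of projecting transaction data as a graph data structure can be illustrated with a small adjacency map: accounts become nodes and transfers become weighted edges. This is only a sketch of the idea; the record fields and sample values are hypothetical, and a real deployment would likely rely on a graph engine rather than in-memory dictionaries.

```python
from collections import defaultdict


def project_as_graph(transactions: list[dict]) -> dict[str, dict[str, int]]:
    """Build {source_account: {target_account: total_amount}} from transfer records."""
    graph: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for txn in transactions:
        graph[txn["from_acct"]][txn["to_acct"]] += txn["amount"]
    # Convert nested defaultdicts to plain dicts for predictable lookups.
    return {src: dict(targets) for src, targets in graph.items()}


# Hypothetical sample transfers (amounts in whole rupees).
sample = [
    {"from_acct": "A", "to_acct": "B", "amount": 100},
    {"from_acct": "A", "to_acct": "B", "amount": 50},
    {"from_acct": "B", "to_acct": "C", "amount": 20},
]
graph = project_as_graph(sample)
print(graph)
```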

194 17 1 For the data to be accessible and consumable by

businesses / downstream applications, the NEXT-

GEN DW should have robust, highly efficient and

parallel execution of data transformation jobs.

We can enable JDBC/ODBC or REST API based access.

Will any specific connection mechanism be required, such as

a specific type of driver for connectivity with

the SAS system, etc.?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

195 17 2 Data Processing Framework

The NEXT-GEN DW ecosystem should have state of

the art data processing engines that can perform in-

memory processing to reduce the time for data

transformations and query in case of real time

requirements.

What are the expected latency requirements for real

time processing? The solution may vary by use case,

depending on whether the requirement is for immediate processing or a latency

of up to 5-10 minutes.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

196 17 3 Framework should allow joining multiple

sources/tables/inputs etc.

Does "sources" here mean the data ingested from

multiple source systems and present on the NEXT-GEN

DW platform, not the data at the source systems?

Yes

197 17 5 Framework should be capable of performing

validation checks pre-and post- processing.

What will be the outcome of validation? This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.
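
Since the expected outcome of validation is left open, one possible reading is a set of pass/fail reconciliation checks comparing what entered a job with what left it. The sketch below assumes two such checks (row count and a control total); the check names and the `amount` field are hypothetical.

```python
def validate_load(source_rows: list[dict], loaded_rows: list[dict],
                  rejected_rows: list[dict], amount_field: str = "amount") -> dict[str, bool]:
    """Compare pre-processing input with post-processing output plus rejects."""
    def total(rows: list[dict]) -> int:
        return sum(row[amount_field] for row in rows)

    return {
        # Every source row must end up either loaded or rejected.
        "row_count": len(source_rows) == len(loaded_rows) + len(rejected_rows),
        # The control total must be preserved across the job.
        "amount_total": total(source_rows) == total(loaded_rows) + total(rejected_rows),
    }


source = [{"amount": 100}, {"amount": 250}, {"amount": -5}]
loaded = [{"amount": 100}, {"amount": 250}]
rejected = [{"amount": -5}]  # e.g. failed a business rule
result = validate_load(source, loaded, rejected)
print(result)
```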

198 17 8 Data Processing Framework

Should have audit and error logs for auditing and

troubleshooting

Is there a requirement to maintain row-level

traceability of the data records, i.e. from the data

consumption layer backwards up to the originating

source file for a particular record?

Yes
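
As an illustration of the row-level traceability confirmed above, one possible approach is to stamp every ingested record with lineage metadata (source file, line number, load batch) that each later layer carries forward unchanged. The sketch below is a minimal Python example under that assumption; the names `Lineage`, `Record`, `ingest`, `transform` and `trace_back`, and the sample file name, are hypothetical and do not come from the EOI.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Lineage:
    """Hypothetical lineage stamp carried by every record."""
    source_file: str
    line_no: int
    batch_id: str


@dataclass
class Record:
    data: Dict[str, str]
    lineage: Lineage


def ingest(source_file: str, rows: Iterable[Dict[str, str]], batch_id: str) -> List[Record]:
    """Attach the originating file and line number to each row at load time."""
    return [
        Record(data=dict(row), lineage=Lineage(source_file, line_no, batch_id))
        for line_no, row in enumerate(rows, start=1)
    ]


def transform(record: Record) -> Record:
    """Example transformation: the data changes, the lineage stamp is preserved."""
    cleaned = {key: value.strip().upper() for key, value in record.data.items()}
    return Record(data=cleaned, lineage=record.lineage)


def trace_back(record: Record) -> str:
    """Resolve a consumption-layer record to its originating source line."""
    lin = record.lineage
    return f"{lin.source_file}:{lin.line_no} (batch {lin.batch_id})"


staged = ingest("accounts_20200609.csv", [{"name": " asha "}, {"name": "ravi"}], "B001")
consumed = [transform(r) for r in staged]
print(trace_back(consumed[1]))
```

In a real platform the same stamp would more likely be stored as extra columns or in a lineage catalogue; the point of the sketch is only that the stamp is set once at ingestion and never rewritten by transformations.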

199 17 9 Automatic recovery of data after failure/rejection

of record needs to happen without any manual

intervention

Is there any specific treatment that needs to be

performed for rejected records?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.
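
Since the treatment of rejected records is left to the Bidder, the following is only one possible pattern, not a treatment prescribed by the Bank: retry each failed record a bounded number of times, then quarantine records that still fail together with the reason, so they can be replayed later without manual intervention in the main load. The function `process_with_recovery`, the loader `flaky_load` and the sample error messages are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]


def process_with_recovery(
    rows: List[Row],
    load: Callable[[Row], None],
    max_attempts: int = 3,
) -> Tuple[List[Row], List[Tuple[Row, str]]]:
    """Load each row, retrying failures; rows that still fail are quarantined."""
    loaded: List[Row] = []
    rejected: List[Tuple[Row, str]] = []
    for row in rows:
        last_error = ""
        for _ in range(max_attempts):
            try:
                load(row)
                loaded.append(row)
                break
            except ValueError as exc:
                last_error = str(exc)
        else:
            # All attempts failed: keep the row and the reason for later replay.
            rejected.append((row, last_error))
    return loaded, rejected


attempts: Dict[str, int] = {}


def flaky_load(row: Row) -> None:
    """Hypothetical loader: 'A2' fails once (transient), 'A3' always fails."""
    key = row["id"]
    attempts[key] = attempts.get(key, 0) + 1
    if key == "A2" and attempts[key] == 1:
        raise ValueError("transient lock timeout")
    if key == "A3":
        raise ValueError("amount is not numeric")


loaded, rejected = process_with_recovery(
    [{"id": "A1"}, {"id": "A2"}, {"id": "A3"}], flaky_load
)
print([r["id"] for r in loaded])
print(rejected)
```

Here the transient failure on `A2` recovers on the second attempt, while `A3` ends up in the reject list with its last error message.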

200 17 11 Framework should have mechanism to protect data

at rest and at motion from unauthorized user

access and amendments.

Is it required to encrypt data at rest and in motion? Is

there a data masking tool available with the Bank? Is

there a need for data masking?

Yes, there is a need for encryption and data masking for data at rest & in

motion. Bidder is free to propose suitable tools/solution for this.
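
To make the masking requirement concrete, the sketch below shows two simple techniques in Python: partial masking of an account number for display, and keyed pseudonymization so that masked values can still be joined. It is an illustration only, using the standard `hashlib` and `hmac` modules; the function names and the demo key are hypothetical, and encryption of data at rest or in motion is not covered here.

```python
import hashlib
import hmac


def mask_account_number(account_no: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with 'X'."""
    if len(account_no) <= visible:
        return "X" * len(account_no)
    return "X" * (len(account_no) - visible) + account_no[-visible:]


def pseudonymize(value: str, key: bytes) -> str:
    """Keyed hash: the same input always maps to the same token, so joins still work."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


# Assumption: in practice the key would come from a key management system.
demo_key = b"demo-key-not-for-production"
print(mask_account_number("12345678901"))
print(pseudonymize("12345678901", demo_key) == pseudonymize("12345678901", demo_key))
```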

201 17 14 Data Processing Framework

Data transformations should be triggered in

parallel. The NEXT-GEN DW should be capable to

run multiple transformation jobs in parallel. The

NEXT-GEN DW should be able to run at-least 1500

jobs in parallel, scalable up to 5000 in next 5 years,

of varying complexity - simple, medium, complex, in

batch or near real time mode every day.

Is the bulk of the data transformation jobs expected to

be triggered during non-business hours, when user

reporting or other workloads are at a minimum? Will

there be users across multiple timezones, or will a

large part of the user base be within a single

timezone?

This information is not required at this stage of EOI.

202 17 17 The processing pipelines for ETL/ELT jobs also

include real time, daily, weekly, monthly, quarterly

and annual reports, feeding data structures for

downstream consumption. These activities are in-

scope for this engagement.

How many such reports are there, and what is the data

model for them? This is needed to estimate the number of

ETL jobs required for the end system.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders

are free to propose suitable solution to meet the requirements of this EOI.

203 17 19 The workflows should work with standard

schedulers. Monitoring and management of

workflows should be possible from an easy to use

interface. Workflow management tool(s) should

have connectors / pluggable interfaces to already

existing / in-use proprietary software available with

the Bank. These could be (and not restricted to)

data repositories, reporting tools, data analysis

tools and generic interfaces for data transfer.

Scheduled jobs status should be made available to

the Bank in Monitoring dashboard on real time

basis.

What integration mechanisms does the proprietary

tool support? Does that tool support REST-based integration?

Which scheduling and monitoring tool does the Bank

have?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

204 18 1 Migration from Existing Setup to Proposed Solution

Vendor should propose a detailed seamless

automated migration plan from existing setup to

proposed solution. Plan should focus on less

manual intervention, data reconciliation between

the systems and minimum parallel run of existing

and proposed solution.

Integration of real time data with the data on the

NEXT GEN DW is possible but does this requirement

mean to integrate with multiple source systems? Will

we get direct access to the source systems?

This information is not required at this stage of EOI.

205 18 1 Migration from Existing Setup to Proposed Solution

Vendor should propose a detailed seamless

automated migration plan from existing setup to

proposed solution. Plan should focus on less

manual intervention, data reconciliation between

the systems and minimum parallel run of existing

and proposed solution.

What is the defined reconciliation mechanism? Is it

point-in-time based? The systems will hold different

data at any given moment, depending on the execution

time and frequency of the ETL jobs.

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.
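
The point-in-time concern raised in this query can be illustrated with a small sketch: both systems are compared only on records updated at or before a common cutoff, using per-record fingerprints. This is one possible approach under stated assumptions, not a mechanism defined by the Bank; the row layout `(key, value, last_updated)`, the function names and the sample data are hypothetical.

```python
import hashlib
from datetime import datetime
from typing import Dict, Iterable, List, Tuple

Row = Tuple[str, str, datetime]  # (key, value, last_updated) - hypothetical layout


def snapshot(rows: Iterable[Row], cutoff: datetime) -> Dict[str, str]:
    """Keep rows updated at or before the cutoff and fingerprint each one."""
    return {
        key: hashlib.sha256(f"{key}|{value}".encode("utf-8")).hexdigest()
        for key, value, updated in rows
        if updated <= cutoff
    }


def reconcile(legacy: Iterable[Row], target: Iterable[Row], cutoff: datetime) -> Dict[str, List[str]]:
    """Compare both systems as of the same point in time."""
    old, new = snapshot(legacy, cutoff), snapshot(target, cutoff)
    return {
        "missing_in_target": sorted(old.keys() - new.keys()),
        "unexpected_in_target": sorted(new.keys() - old.keys()),
        "mismatched": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


cutoff = datetime(2020, 6, 9, 0, 0)
legacy_rows = [
    ("C1", "100.00", datetime(2020, 6, 8, 22, 0)),
    ("C2", "250.50", datetime(2020, 6, 8, 23, 0)),
    ("C3", "75.00", datetime(2020, 6, 9, 1, 0)),  # after cutoff, ignored
]
target_rows = [
    ("C1", "100.00", datetime(2020, 6, 8, 22, 5)),
    ("C2", "250.05", datetime(2020, 6, 8, 23, 5)),
]
result = reconcile(legacy_rows, target_rows, cutoff)
print(result)
```

With the sample data, only `C2` is reported as mismatched, and `C3` is excluded because it was updated after the cutoff.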

206 18 2 Migration from Existing Setup to Proposed Solution

Data migration from Staging and Data Marts, user

tables and any other schemas identified by Bank.

Along with the Staging and Data Mart objects, is there

any integrated data layer in the existing solution? If

yes, then is it built using any proprietary data model?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

207 29 7 Data Science Platform with AI/ML Capabilities

In-memory computing & integration with Spark,

Redis, etc

What kind of analytic processing is expected on Spark,

Redis, etc. ? Is this using Spark-ML, for example ?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

208 29 18 Data Science Platform with AI/ML Capabilities

All machine-learning platforms either support

multiple models out of the box or provide an

option to custom-code the same

What kind of use cases for ML are expected so as to

understand need for existing out of the box vs. custom

solutions ?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

209 29 19 Data Science Platform with AI/ML Capabilities

Integration with R, Python, Keras,

Tensorflow,Theano, scikit-learn etc and other

frameworks / languages

Is there a need to connect with any other analytic

engines to be run in a parallelized/distributed

manner on the system ?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

210 29 24 Data Science Platform with AI/ML Capabilities

Annexure E gives sample use cases which are to be

implemented on Next Gen Data Warehouse using

structured and/or unstructured and/or semi-

structured and/or any other kind of data gathered

from either Data Warehouse or Data Lake or Data

Virtualization or all together or any other source.

Is there a need to read data directly from a low cost

storage system and do complex analysis/queries

involving multi-table joins with curated relational data,

using SQL/analytic functions which requires

performance and scalability ?

Yes

211 35 33 Hardware Specifications

Next-Gen DW should support at-least 500

concurrent users, scalable up to 1000 users in next

5 years, running ETL/ELT jobs or doing ad-hoc data

extraction requests on database (Not including API

based access or scheduled job connections to

database)

What is the nature and expected concurrency of API

based access ? What is nature and concurrency of

scheduled job connections - are these ETL, or

maintenance related connections ?

Will the 1000 concurrent users be expected to be

running queries simultaneously, or is this just 1000

concurrent logons ?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

212 36 37 Hardware Specifications

Ad-hoc jobs of any complexity should not hamper

the scheduled jobs performance.

What is the expected mix of queries in terms of tactical

(very short), medium, long running (reports), batch

loads, near real-time and real-time loads running

simultaneously on the system ?

This information is not required at this stage of EOI.

213 18

Migration from Existing Setup to

Proposed Solution

Vendor should propose a detailed seamless

automated migration plan from existing setup to

proposed solution. Plan should focus on less

manual intervention, data reconciliation between

the systems and minimum parallel run of existing

and proposed solution.

1. How frequent are changes to the code ?

2. What is the defined process to capture the change

requests?

3. What version control tool is currently being used ?

4. What is their current release management process?

5. What is definition of Minimum Parallel Run ?

6. Can the work be done from Teradata offshore

locations, or does it have to be done onsite?

7. What is the current Testing Strategy ? Is there any

document?

8. Are there any performance-related expectations?

9. What are the inventory counts of the objects (if

possible by complexity) of the existing DWH which

need to be migrated?

10. Can the legacy code be shared for pattern analysis

? ( If not complete code base, then can a sample be

shared ?)

11. What kind of Test Automation Tools are used

currently?

12. What is the typical availability of customer SMEs

for UAT ?

13. Does the Customer have the business

reconciliation queries? In other words, How do they

verify the data loaded in the current environment?

14. Does the customer have an existing Test/QA

environment ?

15. What kind of documentation is required as a part

of deliverables?

1. This information is not required at this stage

2. Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

3. Currently we are using IBM Stack. Bidders are free to propose solution(s)

in the best interest of the Bank to meet the requirements given in the EOI.

4. This information is not required at this stage

5. Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

6. Onsite only

7. This information is not required at this stage

8. Bidder to provide details of performance benchmarking to enable us to

take a holistic and comprehensive view of the architecture in formulating

next course of action

9. This information is not required at this stage of EOI.

10. No

11. This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

12. This information is not required at this stage of EOI.

13. Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

14. Yes

15. Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

214 18

Migration from Existing Setup to

Proposed Solution

Data migration from Staging and Data Marts, user

tables and any other schemas identified by Bank.

1. How many more data marts are there other than 4?

2. Are these data marts built on different databases?

3. Does their DWH comprise multiple data marts

only, or do they have an Integrated EDW and model

already in place?

4. Is there any Subject Area Priority (logical split of the

Next Gen DW) & anticipated sizing?

5. What are the SLA times of current ETL jobs?

6. Are the large tables horizontally partitioned?

1. This information is not required at this stage of EOI.

2. No

3. This information is not required at this stage of EOI.

4. Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

5. This information is not required at this stage of EOI.

6. This information is not required at this stage of EOI.

215 18

Migration from Existing Setup to

Proposed Solution

Data migration from existing archival solution to

new one.

1. What is the existing Archival Solution?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

216 18

Migration from Existing Setup to

Proposed Solution

Migration of existing data sourcing ETL jobs.

1. Do they maintain/update a Data Mapping Sheet for

ETL / Data Ingestion?

2. Is the data currently loaded in Batch or Mini-Batch ?

3. Do they have any Design and Coding Standards ?

4. Do they have a document on the current ETL

Architecture/Solution, code patterns and their

complexity ?

5. Which Scheduler is being used ?

6. What is the current data volume and how much

data is ingested through different ingestion

mechanisms (batch, real-time etc) ?

7. Do they want to change any current tools ( ETL )

they have ? If yes, then what would be the tool stack?

8. Do they currently have an ETL Control Framework

implemented?

1. This information is not required at this stage of EOI.

2. Yes

3. This information is not required at this stage of EOI.

4. This information is not required at this stage of EOI.

5. Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

6. Please refer Annexure C & D

7. Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

8. Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

217 19

Migration from Existing Setup to

Proposed Solution

Migration of monitoring dashboard data points 1. Is it to show the progress of migration underway?

2. Is there a need to build a migration framework for

future data migration ?

1. Yes, migration progress can be shown on the monitoring dashboard. In

addition, the Bidder is expected to migrate the existing monitoring dashboard

to the new setup.

2. Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

218 19

Migration from Existing Setup to

Proposed Solution

Migration of Data Governance, Data Lineage and

Data Quality rules and policies

1. What are the existing Data Governance, Data

Lineage and Data Quality rules and policies ?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

219 19

Migration from Existing Setup to

Proposed Solution

Migration of All the remaining components of

existing ecosystem (Mentioned in Annexure - D) as

and when identified by Bank like job scheduler,

reports, history of version control, existing tape

backup, etc.

1. Will the access be provided to all systems - what

would be the constraints ?

This information is not required at this stage of EOI.

220 19

Migration from Existing Setup to

Proposed Solution

Vendor should list out all types of risks they expect

during the migration. Vendor should provide

justification if any downtime is required on existing

or proposed system during migration. Vendor

should provide all the pre-requisites for the

migration in the proposal.

1. What source systems (ERP, CRM etc.) does

the current DWH have?

2. What is the current downtime schedule for their

existing DWH ?

1. This information is not required at this stage of EOI.

2. This information is not required at this stage of EOI.

221 19

Migration from Existing Setup to

Proposed Solution

Vendor to review the existing architecture during

migration and remove duplication of data and

recommend improvements in overall setup if any

1. What are the deduplication rules ? Do they

currently have any defined?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

222 19

Migration from Existing Setup to

Proposed Solution

Vendor should provide a feasible plan for best use

of existing infrastructure which is procured during

last 10 years in staggered manner during the

implementation of Next-Gen DW which will save

cost to the Bank. (Annexure D gives the technology

architecture of the current setup)

1. Need more detailed information for their existing

Infrastructure and eco-system.

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

223 13 Scope of Work Team structure (without actual profiles) Does the bank anticipate OEM involvement in

implementing core OEM related services via the

Systems Integrator?

Yes, please refer to point number # 16, under Hardware Specification on page

number # 34 in EOI for more details.

224 13 Critical Functional Requirements -

Data Ingestion

Data may be structured, semi-structured, and

unstructured. It may come from internal or external

sources. It may come in batches, incremental

additions or real-time feeds. There should be no

limitation on the type, format and size of data

ingested. Data may include log, feeds, audio, video,

image, NOSQL, RDBMS, unstructured text, through

ERP systems, etc

What is the split ratio of Structured : Semi-Structured :

Unstructured (Images) : Unstructured (Videos) data?

This will help in designing the solution, taking

ground realities into consideration.

Please refer to Annexure G

225 25 Critical Functional Requirements -

Elements based Reporting

Vendor should follow the RBI guideline in

developing the solution with which it will be easier

for the Bank to migrate to the element-based data

reporting envisaged by the RBI.

Please elaborate with an example or two explaining

element-based reporting, so that we have a common

understanding.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

226 42 Annexure E- Functional Use Cases

(Risk Area)

General Would the bank expect Graph Analytics capabilities in

the solution for better network analytics which helps

determine the betweenness and strength of the network?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

227 25, 39 Critical Functional Requirements -

Regulatory Reporting

Automation –Tool should automate analytics and

reporting workflow end-to-end, including all data

collection, enrichment, and management, as well as

all calculations, processes to final report

submission. Currently 500+ jobs are being used for

Tranche 1 DCT generation along with 500 more for

other regulatory reports/returns.

Please provide the number of returns/reports for

regulatory body over and above the ones listed in

Annexure E (Sno 4)

This information is not required at this stage of EOI.

228 General General General By when is the RFP expected and by when is the bank

expecting to conclude this?

The reason for this question is, if the contract of

existing Datawarehouse ecosystem with existing

vendor is nearing completion, then this will have a

direct bearing on migration strategy as well as

extension of licencing till such time the migration from

existing to new system takes place (may take few

months)

This information is not required at this stage of EOI.

229 General General General We understand that SBI has an MDM solution, hence

the data quality issues and duplication must be under

control. What DQ tools and what level of quality are expected for the

EDW?

Please refer to Data Quality section of EOI on page #23

230 11 Eligibility Criteria Vendor should have existing Next-Gen Data

Warehouse solution as mentioned in the EOI

Request bank to change the clause as Vendor/OEM

should have existing Next-Gen Data Warehouse

solution as mentioned in the EOI

or

keep the same eligibility criteria as per last tender

SBI/GITC/IDSPM/2017/2018/471 Dated: 18/03/2018

i.e - Bidder should have supplied & implemented at

least 3 orders of enterprise class x86 servers and

storage in India within the last 4 years. Minimum cost

of one single order value for x86 servers and storage

supplied in India should be INR 15 Crore in value.

Please refer to Corrigendum.

231 46 Annexure G- Next Gen Data Ware

House Sizing

Next Gen Data Ware House Sizing Specified

separately for Data Ware House and Data Lake

Is the Bank looking for separate solutions for the Data Ware

House and the Data Lake? If yes, we assume that only structured

data will be in the DWH and semi-structured/unstructured

data will be in the Data Lake.

Request Bank to confirm this.

As clearly mentioned in the EOI requirements, Bank is looking for an integrated

solution having both Data Warehouse and Data Lake components.

Point #21 of Data Ingestion

Bidder should propose which technology is suitable for each kind of upstream

data ingestion like Data Warehouse, Data Marts, Data Lake, Use Data

Virtualization/Federation layer, etc.

232 16 6 User should be able to work on DB even while

backup is in progress. They should be able to run

statistics and reorganize their tables. Any

background process including backup must not

hamper performance of user queries.

We understand this requirement is for Online Backup

& Database level Archival as well

Yes

233 19 9 Migration of All the remaining components of

existing ecosystem (Mentioned in Annexure - D) as

and when identified by Bank like job scheduler,

reports, history of version control, existing tape

backup, etc.

Kindly confirm if existing data needs to be migrated to

proposed Backup Software. Please confirm existing

backup software details.

Yes, Currently we are using IBM Stack. Bidders are free to propose solution(s)

in the best interest of the Bank to meet the requirements given in the EOI.

234 18 3 Data migration from existing archival solution to

new one.

Kindly confirm details of existing Archival Software Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

235 18 3 Data migration from existing archival solution to

new one.

It is advisable to bring the Archival Data to original

production before migration. Kindly add the same.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

236 33 6 Vendor to propose hardware specifications for each

component of Next-Gen DW ecosystem like Data

Warehouse, Data Marts, Data Lake, Data Archival,

Data Federation/Virtualization, Data Science

Platform, Backup, Sandboxes, Functional DR, etc.

for PROD, DEV and UAT environment as applicable

As Backup Software needs to be proposed here, kindly

confirm what features need to be proposed for the

backup software, like deduplication, Compression,

Encryption, Backup Data Replication, Bare Metal

Recovery, hardware level application aware snapshot

backup etc.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

237 33 6 Vendor to propose hardware specifications for each

component of Next-Gen DW ecosystem like Data

Warehouse, Data Marts, Data Lake, Data Archival,

Data Federation/Virtualization, Data Science

Platform, Backup, Sandboxes, Functional DR, etc.

for PROD, DEV and UAT environment as applicable

Kindly confirm whether a Backup & Archival Solution needs to

be proposed for all applications, i.e. Data Warehouse,

Data Marts, Data Lake, Data Archival, Data

Federation/Virtualization, Data Science Platform.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

238 34 16 Installation and Configuration of Storage and

Backup equipment with Hot, warm and Cold data

segregation

Please confirm if this will be "Disk to Disk to Tape"

backup at DC & DR .

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

239 20 1 Data older than specific duration as identified by

Bank to be archived in low cost cold storage.

Changing data archival rules should be easily

configurable. Vendor to propose solution for the

same with cheap and flexible storage and

processing

As we understand, the Archival Solution is required for

File System & Database level Archival.

Yes

240 20 3 All the applications connected to the non-archived

data should be available with archived as well

Kindly elaborate the expectations.

Access to archival solution is expected to be similar to production setup.

241 20 5 Store backup of entire ecosystem on suitable cost-

effective, fast recovery infrastructure (Currently

tape backup is taken)

Kindly confirm the number of tapes. We understand that the

data on these tapes needs to be integrated into the newly proposed

backup software.

Yes, Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

242 14 7 Existing ETL Jobs to be Fine Tuned What kinds of ELT tools are being used currently? If

multiple tools are used, among current 3000+ jobs,

how many for different tools approximately?

Duplicate Query. Refer to Sr.No. #7

243 14 13 Trigger mechanisms in identifying any structural

changes at source

Can you help clarify what the structural changes are? Duplicate Query. Refer to Sr.No. #8

244 14 14 The ingestion tools should be able to perform a

change data capture on source systems of this

nature with run time decompression functionality

If the change data capture is in use now, what's the

vendor product? How many jobs are running on

different CDC tools?

Duplicate Query. Refer to Sr.No. #9

245 15 16 scrapping 4000-5000 logs daily having log size of ~

2 TB each scalable up to 10000 logs.

How many source applications are involved

approximately?

Duplicate Query. Refer to Sr.No. #10

246 15 17 Solution should be able to handle DDL change

without manual reorg/runstat.

Could you please provide use case about DDL change? Duplicate Query. Refer to Sr.No. #11

247 15 18 A job scheduler, along with process management

controls that provide things like runtime

monitoring and error alerting, handling, and

logging.

Is there any job scheduler used currently? If yes,

what's it? If more than one, please list as well.

Duplicate Query. Refer to Sr.No. #12

248 16 12 Downstream departments (data consumer) to be

given separate processing power, storage to

undertake their requirements with separate DB

snapshot

How many downstream departments might be

engaged for DB snapshot approximately? Are they

branch IT or business department?

Duplicate Query. Refer to Sr.No. #13

249 17 12 ETL/ELT tool for data extraction should be AI/ML

features for suggesting / improving Query / ETL /

ELT Stages

Could you elaborate on some use cases for the AI/ML

features? Do you know of any ETL tool with AI/ML

features?

Duplicate Query. Refer to Sr.No. #14

250 18 16 Reports on job status and success / failure /

retrigger should be sent to concerned stakeholders

on a continuous basis

Through which channels are reports sent out: email,

SMS, or some real-time monitoring dashboard? Please

help clarify.

Duplicate Query. Refer to Sr.No. #15

251 18 19 Workflow management tool(s) should have

connectors / pluggable interfaces to already

existing / in-use proprietary software available with

the Bank. These could be (and not restricted to)

data repositories, reporting tools, data analysis

tools and generic interfaces for data transfer.

What are the exact products and vendors of these 'data

repositories, reporting tools, data analysis tools and

generic interfaces for data transfer'?

Duplicate Query. Refer to Sr.No. #16

252 18 3 Data migration from existing archival solution to

new one.

What's the current existing archival solution? Archived

to tape lib ? Please help clarify.

Duplicate Query. Refer to Sr.No. #17

253 19 6 Migration of monitoring dashboard data points. What is the current dashboard product being used now?

It's an enterprise-level dashboard and will continue to

be used, isn't it?

Duplicate Query. Refer to Sr.No. #18

254 19 8 Migration of Data Governance, Data Lineage and

Data Quality rules and policies

Is there any automation tool used for governance,

lineage, quality etc? What are they?

Duplicate Query. Refer to Sr.No. #19

255 19 9 history of version control, existing tape backup What does 'version control' mean here? Is it the historical

versions of programs / stored procedures, or a version

control tool?

Duplicate Query. Refer to Sr.No. #20

256 19 9 history of version control, existing tape backup Does it mean that the 'tape backup' migrates to the data lake

or similar? Will the historical tape backup be part of the

migration scope, or will the historical backup be left as it is, with only

backups from now onwards to be taken care of?

Duplicate Query. Refer to Sr.No. #21

257 19 12 Vendor should provide a feasible plan for best use

of existing infrastructure which is procured during

last 10 years

Could you please share the software product stack and

proprietary or open hardware specifics of existing

architecture?

Duplicate Query. Refer to Sr.No. #22

258 19 3 The NEXT-GEN DW – DR solution needs to be set

up at a remote location at Hyderabad.

What's the distance between PROD site and DR site?

What's the network bandwidth between?

Duplicate Query. Refer to Sr.No. #23

259 20 3 The architecture of NEXT-GEN DW should enable

interaction between public cloud and designated

edge servers alone.

Could you please elaborate with some detailed use

case?

Duplicate Query. Refer to Sr.No. #24

260 21 16 Back Dated Data changes needs to be updated on

portal

Could you please clarify with some case of Back Dated

Data change?

Duplicate Query. Refer to Sr.No. #25

261 22 3 Management of referential integrity Could you please clarify with some case of referential

integrity?

Duplicate Query. Refer to Sr.No. #26

262 23 7 Recommend Enrichment — Enhancing the value of

internally held data by appending related attributes

from external sources

Could you please give an example? Duplicate Query. Refer to Sr.No. #27

263 23 10 Identity resolution - Identity resolution is the

process of linking various records and is the main

engine for record de-duplication, which can enable

some aspects of data cleansing.

Could you please give an example? Duplicate Query. Refer to Sr.No. #28

264 24 1 Authentication and Identity Management - A

comprehensive identity and access management

system should be available for centralized

management of users and groups.

Are you looking for a standalone AIM specific for this

NEXT-GEN DW solution? Or there is already some

centralized AIM at enterprise-level, this new-

introduced AIM of this solution is required to integrate

or sync with this enterprise-level one? Please help

clarify.

Duplicate Query. Refer to Sr.No. #29

265 25 1 Vendor should follow the RBI guideline in

developing the solution with which it will be easier

for the Bank to migrate to the element-based data

reporting envisaged by the RBI.

Does it mean a new regulatory reporting

framework/application is requested to replace the

existing one? Or just keep this regulatory reporting

framework/application but migrate the datastore to

NEXT-GEN DW? What is the current framework, from

which product vendor?

Duplicate Query. Refer to Sr.No. #30

266 28 5 Self-service portal to extract the data on their own

(Should support Data Democratization)

Is there any approval procedure, evaluation of

shareability, or masking/transformation prior to

self-service extraction, and any data destruction procedure

afterwards? Who are the main requestors and what is

the frequency of requests? Please help clarify.

Duplicate Query. Refer to Sr.No. #31

267 28 1 Implementing end to end analytics use-cases as

mandated by the Bank

Could you list all the analytics use-cases or some typical

cases?

Duplicate Query. Refer to Sr.No. #32

268 28 2 Power data / objects to existing analytics models

built on proprietary tools (IBM SPSS).

Please advise the exact version of IBM SPSS Duplicate Query. Refer to Sr.No. #33

269 30 1 Capability to connect to various data sources Is there any BI tool being used at this moment? If yes,

what is it and the version?

Duplicate Query. Refer to Sr.No. #34

270 39 1 Database appliance What is the product of the existing appliance? E.g.

Teradata, Exadata?

Duplicate Query. Refer to Sr.No. #35

271 39 3 Scrapping of Jobs from Source What kind of scrapping tools are used currently? In

this solution, is there any preference to reuse these

tools?

Duplicate Query. Refer to Sr.No. #36

272 39 3 Scrapping of daily 5TB + logs What kind of logs here? Are they database logs? If yes,

what are all these database products, e.g. Oracle, DB2,

SQLServer, MySQL etc?

Duplicate Query. Refer to Sr.No. #37

273 39 4 Reporting Is any reporting tool used? E.g. Tableau etc. Duplicate Query. Refer to Sr.No. #38

274 39 6 Hardware Monitoring What is the Monitoring product and version? Duplicate Query. Refer to Sr.No. #39

275 39 6 Server memory and compute monitoring of total

80+ production servers

What type are these production servers: appliances

or just commodity x86 servers? What

are their typical specs? How many years have they been

in use?

Duplicate Query. Refer to Sr.No. #40

276 39 7 User database of 30000+ officials How many different types of these users? How many

data analysts, which can write and perform ad hoc SQL

query?

Duplicate Query. Refer to Sr.No. #41

277 39 8 Job Scheduling What scheduling tool is currently

being used?

Duplicate Query. Refer to Sr.No. #42

278 39 18 10TB Oracle database What version is used? Duplicate Query. Refer to Sr.No. #43

279 48 SCAPM IBM Smart Cloud What kind of applications are running on that? What's

the interaction with NEXT-GEN DW solution?

Duplicate Query. Refer to Sr.No. #44

280 39 Annexure D Existing Data Warehouse Architecture What is the current number of users at the

BI/ETL/Database level? What is the user concurrency at

BI/ETL/Database? Also, what is the type/make/core

information of the ETL/DWH/ODS/DM servers?

Refer to Hardware Specifications section starting on page # 32

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

281 39 Annexure D Existing Data Warehouse Architecture Existing backup details are not provided. Please

provide backup and other configuration details:

whether existing backups are disk based or tape based, and the frequency

of backups.

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

282 19 Disaster Recovery

clause 1

Bank proposes to setup only functional DR to start

with. At later stage Bank may take decision to

setup full scale 100% DR.

Please elaborate on functional DR in terms of PROD

capacity. Is DR being looked at from Day 1?

Please refer to subsection Disaster Recovery on page no #19 of EOI document

283 19 DR Clause 4 The DR solution should be synced with production

NEXT-GEN DW. The SLA for RTO should be

maximum 2Hrs as per Bank’s defined policy.

Since the volumes involved are large, the bandwidth

capacity and the time slots for replication will be

provided by the Bank. Please clarify.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

284 19 DR Clause 6 The proposed solution is expected to have a

monitoring engine that can determine the health of

production NEXT-GEN DW and raise alerts / trigger

remedial actions to bring NEXT-GEN DW – DR as

the default NEXT-GEN DW

Bank wishes to have an automation tool for the same.

Please clarify.

Please refer to subsection Disaster Recovery on page no #19 of EOI document

285 32 HW Specs clause 10 Vendor must provide detailed configuration of the

proposed Hardware, including Hosting Space

Requirements, Racks, Power, Cooling and any other

requirement for the fulfillment of the Vendor’s

obligation in this EOI.

For exact sizing, various inputs/interactions will be

required. For the EOI, an approximate indication should suffice.

Please confirm.

High level/approximate indication would suffice

286 35 HW Specs Clause 30 Vendor is required to provide the minimum

resources to monitor & manage the infrastructure,

however it is the Vendor’s responsibility to right

size the resources to meet the SLA

Clarification needed on Bank's expectations on

number of resources. Can you please explicitly

mention the SLA requirements.

This information is not required at this stage of EOI.

287 36 HW Specs Clause 44 Vendor need to propose a solution for data

migration / transfer between Existing DWH (Navi

Mumbai Location 1) and NEXT-GEN DW-PR (Navi

Mumbai Location) and also between NEXT-GEN DW-

PR (Navi Mumbai Location 2) and Hyderabad (DR)

or any other places for PR and DR decided by the

Bank.

Please share details of locations, approximate distances, and

bandwidth capacity to be provided by the Bank. Please

clarify.

This information is not required at this stage of EOI.

288 36 HW Specs Clause 48 The vendor should provide EXACT size needed for

production in the 1st year and estimated sizes for

consecutive years keeping in view the growth rate

predicted by Bank in this section and provide

empirical evidence for the calculation of growth

rate.

For exact sizing, various inputs/interactions will be

required. For the EOI, an approximate indication should suffice.

Please confirm.

High level/approximate indication would suffice

289 39 Annexure D Existing Data Warehouse Architecture Please share : 1. Detailed architecture diagram of

existing EDW & 2. Breakup of tiers among the

80+ production servers.

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

290 46 Annexure G Next Gen Data Warehouse Sizing DR to be sized only for DWH. Please confirm. Please refer to Annexure G for sizing.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

291 40 Annexure E Functional Use Cases Is the Bank looking to leverage the existing OFSAA

implementation for these use cases? Alternatively, will

OFSAA feed from the DWH? Please let us know the

interplay between DWH and OFSAA.

This information is not relevant to EOI.

292 40 Annexure E Functional Use Cases - does the bank see the Next-Gen-DW as a

complement to existing data solutions at the bank

(e.g. OFSAA) ?

This information is not relevant to EOI.

293 40 Annexure E Functional Use Cases - role of Next-Gen-DW in mission critical operational

processes e.g. daily RBI and regulator reporting,

real time financial crime detection, etc.

This information is not relevant to EOI.

294 12 Annexure B End State Objective : Migration from existing setup Please provide complete detail of existing DWH

Solution. Information including but not limited to:

DWH platform, model, version and Size

Details of CPUs, Memory, OS, Database version

Details of Storage configuration: Size, capacity, free

and used space etc

Whether there are any physical or logical isolation of

DWH setup

HA, Backup, DR details

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

295 13 Annexure B Performance Benchmark of Next Gen DWH Oracle provides industry-standard, globally accepted

metrics to measure database performance, such as IOPS,

throughput and load rate. We believe this addresses

the requirement.

Bidder to provide details of performance benchmarking to enable us to take a

holistic and comprehensive view of the architecture in formulating next

course of action

296 16 4 Storage replication (e.g. RAID) should be

automatically managed by the platform.

We believe this refers to a RAID level. If not, please

elaborate.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

297 16 8 Storage should support data compression. It should

be possible to perform both fast compression and

efficient compression based on data processing

needs.

We recommend that the Bank additionally ask for storage

and a DWH system capable of providing columnar as

well as row-based compression, and that data should be

readable without decompression.

No change in standard clause of EOI. Bidders are free to propose solution(s)

in the best interest of the Bank to meet the requirements given in the EOI.

298 16 9 The storage should be horizontally and vertically

scalable. Redistribution of data across the NEXT-

GEN DW should be possible automatically and

seamlessly.

We recommend that the Bank additionally ask that

storage upgrades and capacity increases be done

without downtime.

No change in standard clause of EOI. Bidders are free to propose solution(s)

in the best interest of the Bank to meet the requirements given in the EOI.

299 16 11 The storage system should be robust to handle at

least 1,50,000 concurrent queries (Select/DML) by

processing engines / ETL jobs / end users scalable

up to 6,00,000 concurrent queries in next 5 years

(assuming parallelism of 100 degree).

1. Please provide details of queries and ETL jobs

2. Please provide ratio of Select/DML queries and ETL

jobs

3. Share the complexity of query- simple, medium,

complex Queries

4. Also provide Query definition of Simple, Medium

and Complex queries

This information is not required at this stage of EOI.

300 16 12 Downstream departments (data consumer) to be

given separate processing power, storage to

undertake their requirements with separate DB

snapshot, Audit trails should be available for any

user accessing the Databases. Construction of this

separate Database snapshot and enabling this audit

trails must not cause any major systemic

issues/challenges in smooth functioning of primary

DB.

Please provide rationale and logic to have separate

processing and storage for this. Powerful DWH

systems are today capable of servicing all consumer

groups in parallel

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

301 17 12 ETL/ELT tool for data extraction should have AI/ML

features for suggesting / improving Query / ETL /

ELT Stages

Please clarify further with example This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

302 17 13 Existing reports and extracts generation jobs on

DWH should be analyzed and transformed to the

NEXT-GEN DW. The vendor should use preferably

off-the- shelf tools and not resort to building from

scratch.

Please share sample reports and extracts to propose

most suitable options for migration

This information is not required at this stage of EOI.

303 17 Data transformations should be triggered in

parallel. The NEXT-GEN DW should be capable to

run multiple transformation jobs in parallel. The

NEXT-GEN DW should be able to run at-least 1500

jobs in parallel, scalable up to 5000 in next 5 years,

of varying complexity - simple, medium, complex, in

batch or near real

Please define complexity: simple, medium, complex.

Share the split % of simple, medium, complex.

Please share sample jobs for each type.

This information is not required at this stage of EOI.

304 27 User Management: Pt4-The access privileges

associated with each system product, e.g.

operating system, network, database, application

and system utilities, and the users to which these

privileges need to be allocated should be clearly

identified and documented.

Should we assume that the access privileges are to be

assigned to the users directly and managing access to

these privileged accounts is not required?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

305 39 User database of 30000+ officials Should we assume approx 30k users would access the

solution with a YoY increase of 10%?

Refer to Hardware Specifications section starting on page # 32

306 30 6 Reporting on all types of available of Data Formats;

· Structured, semi-structured, unstructured

· Click stream data

· Audit Logs

· Documents

· Multimedia data (Images/Videos/Audios)

· XBRL format

· IRIS iFILE framework

Please clarify the type of database used for

each of the data formats asked for. Or is it safe to assume

the underlying data will be in an industry-standard

relational data format such as Oracle, DB2, etc.?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

307 30 5 Visualizations: BI tools must provide below

different types of visualizations;

· Animations, Barcodes

· Bar, line, pie, area and radar chart types

· Tables, Graphs, Infographics, Filters

· Widgets

· Drag and Drop Creation, Customization

· Templates

· Freehand SQL Command

· Geospatial Integration

· Layouts

· Themes

· Ability to mix and match various combinations

Please elaborate on the definition of

-Animations

-Infographic :what all visualization are you referring to

-Widgets, Templates

These are industry standard terms used in Visualization of data.

308 32 22 In-memory analytics: The product should pull data

into an in-memory or locally cached data store

preferably columnar is an increasingly popular

feature that enables very fast analytics once the

data is loaded.

To achieve fast analytics, BI tools may adopt

different architectures. A BI tool can easily leverage the

in-memory benefits of the underlying database without

pulling data and creating redundancy, which would otherwise

reduce data manageability at the BI layer. We request

you to rephrase this point as

"In-memory analytics: The product should pull data

into an in-memory or leverage the In-Memory

capabilities of the underlying database or locally cached

data store preferably columnar is an increasingly

popular feature that enables very fast analytics once

the data is loaded."

No change in standard clause of EOI

309 32 23 Offline updates: BI tools, when storing copies of

the source data in an online analytical processing

(OLAP) cube or in-memory columnar data store,

should enable business users to schedule

automatic data updates.

Different BI tools have different architectures. A BI tool

can easily leverage the capabilities of the underlying

database. Is this point relevant to BI tools whose

architecture stores the data on the BI server?

No change in standard clause of EOI

310 32 28 Speed of access: Query performance will vary based

on the complexity of the queries and the amount of

data involved. Dashboards with multiple

visualizations will need to get query results from

many queries. The best practice is to create several

prebuilt query scenarios and compare how each

product performs based on these specific

examples. The worse practice is to just arbitrarily

rate the speed.

These seem to be best practices for implementation.

Please elaborate on what the requirement is for the BI

tool.

Requirement from BI Tool is a faster speed of access. Bidders are free to

propose solution(s) in the best interest of the Bank to meet this requirement.

311 32 29 The best practice is to establish a testing

environment to determine scalability in terms of

both the number of concurrent users and data

metrics, such as volumes, variety and veracity.

These seem to be best practices for implementation.

Please elaborate on what the requirement is for the BI

tool.

Requirement is to set up a testing environment by following best practice.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet this requirement.

312 32 32 Ability to handle and summarize huge volumes of

data. E.g. 30-40 million rows accessed on index and

summarized over 5 to 8 metrics.

Please elaborate on the use case for consumption of 30-

40 million rows from BI. Usually a BI tool leverages the

underlying database to do summarization of data and

only works on the resulting dataset.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

313 35 36 The web portal of Business Intelligence tool should

support at-least 25000

concurrent users, scalable up to 75000 in next 5

years, accessing various reports generated

For doing the sizing of Business Intelligence we need

bifurcation of the concurrent users (25000)

Total Concurrent Users : 25000

Number of concurrent active : provide the concurrent

active user count

Number of logged-in/inactive : provide the

logged-in/inactive user count

Out of Active Users:

- Users executing BIEE dashboards (having 4/5 reports

or simple charts in a dashboard)

- Users executing large Pivot table operations (25000+

rows)

- Users executing (small to medium sized report - 50K

cells or lower) export to pdf/XL operations

- Users executing very heavy Graphics

Number of Active Concurrent running Extra Large

Reports

(Usually Extra Large Reports are executed off-line

hours)

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

314 15 point # 16 Proposed solution should be able to scrap

encrypted log, capture Metadata

changes at source level completely, scrapping 4000-

5000 logs daily having log

size of ~ 2 TB each scalable up to 10000 logs.

Proposed solution should be

capable of scrapping logs generated by any type of

Database. E.g. Oracle

Database, IBM DB2 Database etc.

Kindly specify if decryption of logs is also required or

if only storage of such logs is fine. If yes, is

decryption logic available in the specified system?

Yes. Bidder to support the encryption/decryption logic at each source system

315 38 Annexure C -Monthly Data processed

in DWH Warehouse

Archived log extract CBS (SBI) +

TF (SBI)

Are these logs encrypted?

Do we need to keep the RAW logs in the system, or

only processed logs?

Yes, logs are encrypted. Bidders are free to propose solution(s) in the best

interest of the Bank to meet the requirements given in the EOI.

316 29 # 9 (Data Science Platform with

AI/ML Capabilities)

GPUs to be incorporated in solution if possible

using HDFS Hadoop like

environment for better analytical results

1. Is there a requirement to run AI/ML models within

HDFS Hadoop? Or is the expectation to pull the data into a

GPU based analytics workbench and then process it?

2. Running AI/ML models within Hadoop is also faster,

and having a separate GPU based system for specific AI

models can reduce the cost of the GPU based solution.

Please suggest.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

317 15 # 1 - Data Storage A multi-temperature data management solution to

be proposed by vendor where

data that is frequently accessed on fast

storage (hot data), compared to less frequently

accessed data stored on slightly slower

storage (warm data), and

rarely accessed data stored on the slowest storage

(cold data). System should

also be capable of automated storage tiering and

seamless data transfer between

hot, warm and cold storage. Data residing in any of

these storage areas must be

seamlessly mixed / merged according to

requirements without impacting

performance.

Kindly share the tentative timeline for Hot/Warm/Cold

data so that we could calculate the size. Example: Hot

Data - 6 months, Warm Data - 1 year, Cold data > 1

year etc.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

318 18 18 Transformations for this activity can be categorized

into the following types:

· Existing transformations in DWH that needs to be

migrated to NEXT-GEN DW

Please share the existing transformation details. This information is not required at this stage of EOI.

319 20 1 Data older than specific duration as identified by

Bank to be archived in low cost cold storage.

Changing data archival rules should be easily

configurable. Vendor to propose solution for the

same with cheap and flexible storage and

processing

Please provide Retention period This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

320 20 5 Store backup of entire ecosystem on suitable cost-

effective, fast recovery infrastructure (Currently

tape backup is taken)

Please provide the details of existing Tape Backup

Solution, Backup Window, Backup Throughput and

Restoration throughput

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

321 13 Annexure B - Technical Criteria/Scope

of Work > Critical Functional

Requirements > Data Ingestion Sr #2

Data may be structured, semi-structured, and

unstructured. It may come from internal or external

sources. It may come in batches, incremental

additions or real-time feeds. There should be no

limitation on the type, format and size of data

ingested. Data may include log, feeds, audio, video,

image, NOSQL, RDBMS, unstructured text, through

ERP systems, etc

Which RDBMS source data is required to be extracted

in real-time mode? Please provide the source system

name, RDMS type (Oracle/ SQLServer etc) and the

underlying OS

This information is not required at this stage of EOI.

322 19 Annexure B - Technical Criteria/Scope

of Work > Critical Functional

Requirements >Migration from

Existing Setup to Proposed Solution

Sr #5

Migration of existing data extraction and reporting

jobs.

Since this is across different products, is this expected

to be semi-automated / manual?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

323 19 Annexure B - Technical Criteria/Scope

of Work > Critical Functional

Requirements >Migration from

Existing Setup to Proposed Solution

Sr #6

Migration of monitoring dashboard data points. Since this is across different products, is this expected

to be semi-automated / manual?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

324 19 Annexure B - Technical Criteria/Scope

of Work > Critical Functional

Requirements >Migration from

Existing Setup to Proposed Solution

Sr #7

Migration of user details. Since this is across different products, is this expected

to be semi-automated / manual?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

325 19 Annexure B - Technical Criteria/Scope

of Work > Critical Functional

Requirements >Migration from

Existing Setup to Proposed Solution

Sr #8

Migration of Data Governance, Data Lineage and

Data Quality rules and policies

Since this is across different products, is this expected

to be semi-automated / manual?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

326 38 Annexure C - Monthly Data

processed in DWH Warehouse Sr #4

Contribution from other source systems like DMAT,

CMP, SBI Life, LOS, etc.

Will these also be flat files? If not, what will be the

interface mode (RDBMS, webServices, API etc) and

which RDBMS?

Support from Data Ingestion for all possible types to be considered in solution

327 38 Annexure C - Monthly Data

processed in DWH Warehouse Sr #4

Contribution from other source systems like DMAT,

CMP, SBI Life, LOS, etc.

Please provide a count of data sources (best

approximation). Of these, how many will be flat file

sources?

This information is not required at this stage of EOI.

328 23 Annexure B - Technical Criteria/Scope

of Work > Critical Functional

Requirements > Data Quality Sr #12

Mechanism to capture feedback from end users to

report Data Quality issues

Please elaborate. Can this be implemented using

enterprise collaboration tooling / ticket maintenance

system?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

329 39 Annexure D - Existing Data

Warehouse Architecture Sr #14

Data Quality What are the existing Data Quality details? How many

and which entity masters are maintained? What is the

current count of each type of Entity and how are their

counts expected to scale up (volumetrics)?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

330 46 Annexure G Annexure G - Next Gen Data Warehouse Sizing What is the rationale behind the DWH size (592.16TB -

2251.96TB) at Primary and (19.59TB - 75.26TB) at the DR

site?

We have analyzed the data points which should be there in production and

DR. The rationale is based on the same.

However, Bidders are expected to propose storage forecast over and above

given sizing in the solution to ensure fast performance of system.

331 11 Annexure A Vendor should have existing Next-Gen Data

Warehouse solution as mentioned in the EOI

Bidder wants to understand if this criterion is applicable

to the bidder or to the platform (OEM) proposed as

part of the solution to the EOI.

Please refer to Corrigendum.

332 11 Annexure A The company/firm should be profit making

organization for last 3 years.

Bidder requests this clause to be modified as below :

Bidder should be profitable organization based on

Profit After Tax (PAT) for at least two out of last three

financial years (2015-2016, 2016-2017, 2017-2018)

No change in standard clause of EOI

333 12 Technical Criteria/Scope of Work :

End state objectives 

Log Storage/Archive Solution Bidder wants to understand if Archival solution is

required considering the bank wants to implement

both Primary DC & Secondary DR Site.

Yes, Archival solution is required as mentioned in 'Data Archival and Backup'

sub section on page # 20 of EOI

334 17 Data Processing Framework #14 at least 1500 jobs in parallel, scalable up to 5000 Bidder requests details of job types, ETL, ELT, System

Processes/Query and end user query along with load

type whether simple, medium, complex.

Also the bidder would like to understand whether the

jobs would run in batch or in near real time .

Kindly specify the % of the total number of jobs under

each category (namely Simple, Medium, Complex).

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

335 18 Data Federation/Virtualization Vendor should propose a detailed seamless

automated migration plan from existing setup to

proposed solution. Plan should focus on less

manual intervention, data reconciliation between

the systems and minimum parallel run of existing

and proposed solution

Bidder requests details of existing setup (product details

and HW).

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

336 18 Migration from Existing Setup to

Proposed Solution

Vendor should propose a detailed seamless

automated migration plan from existing setup to

proposed solution. Plan should focus on less

manual intervention, data reconciliation between

the systems and minimum parallel run of existing

and proposed solution

Bidder requests details of existing setup (product

details and HW):

1. DW software and HW configuration of nodes

2. ETL software and HW configuration of nodes

3. Data Federation/Virtualization software and HW

configuration of nodes

4. BI software and HW configuration of nodes

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

337 19 Disaster Recovery Bank proposes to setup only functional DR to start

with. At later stage Bank may take decision to

setup full scale 100% DR.

Bidder requests details on location of DR. Bidder also

requests additional details on the functional DR

requirement, e.g. 10% of Storage, 50% of ETL, 50% of

BI.

Storage requirements for functional DR are specified in Annexure G. Bidders

are free to propose solution(s) in the best interest of the Bank to meet the

requirements given in the EOI.

338 20 Data Archival and Backup Data older than specific duration as identified by

Bank to be archived in low cost cold storage.

Changing data archival rules should be easily

configurable. Vendor to propose solution for the

same with cheap and flexible storage and

processing

Bidder requests the Bank to specify which kind of backup

archival is required:

a) Disk to disk

or

b) Disk to tape.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

339 22 Data Governance #14 Parallel processing: Data governance tool should be

able to handle 500 concurrent users, scalable up to

1000 users in next 5 years, running any kind of job

(eg: Data Lineage on simple/medium/complex jobs

running on multiple tables)

Bidder requests Bank to clarify whether these jobs are

subset of Data Processing Framework #14

No, both are completely separate jobs.

340 16 Data Storage #6 User should be able to work on DB even while

backup is in progress. They should be able to run

statistics and reorganize their tables. Any

background process including backup must not

hamper performance of user queries.

Bidder understands that this requirement is for Online

Backup & Database level Archival as well. Kindly

confirm.

Duplicate Query. Refer to Sr.No. #232

341 19 Migration from Existing Setup to

Proposed Solution #9

Migration of All the remaining components of

existing ecosystem (Mentioned in Annexure - D) as

and when identified by Bank like job scheduler,

reports, history of version control, existing tape

backup, etc.

Kindly confirm if existing data needs to be migrated to

proposed Backup Software. Please confirm existing

backup software details.

Duplicate Query. Refer to Sr.No. #233

342 18 Migration from Existing Setup to

Proposed Solution #3

Data migration from existing archival solution to

new one.

Kindly confirm details of existing Archival Software Duplicate Query. Refer to Sr.No. #234

343 18 Migration from Existing Setup to

Proposed Solution #3

Data migration from existing archival solution to

new one.

It is advisable to bring the Archival Data to original

production before migration. Kindly add the same.

Duplicate Query. Refer to Sr.No. #235

344 33 Hardware Specifications #6 Vendor to propose hardware specifications for each

component of Next-Gen DW ecosystem like Data

Warehouse, Data Marts, Data Lake, Data Archival,

Data Federation/Virtualization, Data Science

Platform, Backup, Sandboxes, Functional DR, etc.

for PROD, DEV and UAT environment as applicable

As Backup Software needs to be proposed here, kindly

confirm what features need to be proposed for

backup software, like Deduplication, Compression,

Encryption, Backup Data Replication, Bare Metal

Recovery, hardware-level application-aware snapshot

backup, etc.

Duplicate Query. Refer to Sr.No. #236

345 33 Hardware Specifications #6 Vendor to propose hardware specifications for each

component of Next-Gen DW ecosystem like Data

Warehouse, Data Marts, Data Lake, Data Archival,

Data Federation/Virtualization, Data Science

Platform, Backup, Sandboxes, Functional DR, etc.

for PROD, DEV and UAT environment as applicable

Bidder requests confirmation whether a Backup &

Archival Solution needs to be proposed for all

applications, i.e. Data Warehouse, Data Marts, Data

Lake, Data Archival, Data Federation/Virtualization,

Data Science Platform

Duplicate Query. Refer to Sr.No. #237

346 34 Hardware Specifications # 16 Installation and Configuration of Storage and

Backup equipment with Hot, warm and Cold data

segregation

Please confirm if this will be "Disk to Disk to Tape"

backup at DC & DR.

Duplicate Query. Refer to Sr.No. #238

347 20 Data Archival and Backup #1 Data older than specific duration as identified by

Bank to be archived in low cost cold storage.

Changing data archival rules should be easily

configurable. Vendor to propose solution for the

same with cheap and flexible storage and

processing

Bidder understands that the Archival Solution is

required for File System & Database level Archival.

Please confirm.

Yes

348 20 Data Archival and Backup #3 All the applications connected to the non-archived

data should be available with archived as well

Request Bank to elaborate further on the scope of

work expected from Bidder under this criteria.

Access to archival solution is expected to be similar to production setup

349 20 Data Archival and Backup #5 Store backup of entire ecosystem on suitable cost-

effective, fast recovery infrastructure (Currently

tape backup is taken)

Kindly confirm the no. of tapes. Bidder understands that

the data in these tapes needs to be integrated into the

newly proposed backup software. Kindly confirm

the scope.

Yes, Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

350 20 Cloud Integration and Migration -

Point 5

In view of the intent to reduce the hardware

footprint (in future), the technical architecture of

NEXT-GEN DW solution should be flexible to

accommodate provisioning of NEXT-GEN DW on

cloud.

Is the Bank expecting that the NEXT-GEN DW solution should be

flexible to accommodate provisioning of NEXT-GEN

DW on Public cloud? Please confirm.

At present, as per the Bank's IS policy migrating/storing data in public cloud is

not permitted. However bidders may propose, as an alternative, use of cloud

(public cloud, private cloud, on-premise etc) in addition to the best integrated

proposed solution.

351 36 Hardware Specifications - Point 45 Database should be linearly scalable which can

expand the database capacity by just adding more

nodes to the existing database.

Is Bank expecting bidder to supply Hyper Converged

Infrastructure?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

352 37 Hardware Specifications - Point 51 Vendor to submit all back-to-back agreement

copies between Vendor and SI / OEM / Parent

company etc if any and tenure of the back-to-back

agreement should be same as selected Vendor’s

agreement with the Bank

Please elaborate on the type of back-to-back agreement

with OEMs.

Please refer to Corrigendum.

353 Page 20 Data Archival and Backup Store backup of entire ecosystem on suitable cost-

effective, fast recovery infrastructure (Currently

tape backup is taken)

Please provide the details of the Backup Window, e.g. 1

TB of data has to be backed up to Tapes in 1 hour.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.
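
The bidder's example (1 TB backed up in 1 hour) translates into a sustained throughput figure, which is what a backup window ultimately constrains. A minimal sketch of that arithmetic follows, assuming decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes); the function name is illustrative only and the figures are the bidder's example, not a requirement stated by the Bank.

```python
def required_throughput_mb_per_s(data_tb: float, window_hours: float) -> float:
    """Sustained throughput (decimal MB/s) needed to back up data_tb within window_hours."""
    bytes_total = data_tb * 10**12   # decimal terabytes to bytes
    seconds = window_hours * 3600    # backup window in seconds
    return bytes_total / seconds / 10**6


rate = required_throughput_mb_per_s(1, 1)
print(f"{rate:.1f} MB/s")
```

For the 1 TB in 1 hour example this works out to roughly 278 MB/s sustained, before allowing for compression, deduplication, or media overhead.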

354 Page 20 Data Archival and Backup Data older than specific duration as identified by

Bank to be archived in low cost cold storage.

Changing data archival rules should be easily

configurable. Vendor to propose solution for the

same with cheap and flexible storage and

processing

Please provide the details for existing Archival storage

and retention policy.

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

355 Page 33 Point 5 The proposed architecture considers vertical and

horizontal scalability as one of the most important

design principles.

Request to please reconsider vertical scalability,

because after 2-3 years server processor availability will be an issue and the

commercial impact for a particular processor will be very

high.

No change in standard clause of EOI

356 Page 33 Point 6 Vendor to propose hardware specifications for each

component of Next-Gen DW ecosystem like Data

Warehouse, Data Marts, Data Lake, Data Archival,

Data Federation/Virtualization, Data Science

Platform, Backup, Sandboxes, Functional DR, etc.

for PROD, DEV and UAT environment as applicable

Is Bank expecting bidder to provide Infra for Sandbox

environment? Please confirm.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

357 Page 33 Point 10 Vendor must provide detailed configuration of the

proposed Hardware, including Hosting Space

Requirements, Racks, Power, Cooling and any other

requirement for the fulfillment of the Vendor’s

obligation in this EOI

What will be the per rack power availability (in kVA).

Please confirm

Bidder to propose required power per rack

358 Page 33 Point 11 Vendor will supply hardware resources and related

services at the desired locations (Production and

DR)

Is Bank expecting bidder to provide the same manpower

resources at DR site as compared to DC to maintain the

Infra? Please confirm.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

359 Page 34 Point 16 The Vendor shall also carry out OS Hardening, Anti-

Virus installation, Create Super user for the

Production, DR and UAT/Dev environment

according to Bank’s policy and secured

configuration document

Is Bank expecting the bidder to supply Antivirus

solution (License) and Antivirus servers, or will Bank

provide AV license? Please confirm.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

360 Page 35 Point 23 System Administration Support- Service Provider

must provide 24X7 supports for the Administration,

Maintenance, Up-gradation (related to hardware)

and other related activity to keep system running

so that high availability can be assured

Is Bank expecting 24x7 onsite support from bidder for

proposed infrastructure, or can bidder provide remote

support? Please confirm.

Proposal should include 24x7 onsite support

361 Page - 36 Point 44 Vendor need to propose a solution for data

migration / transfer between Existing DWH (Navi

Mumbai Location 1) and NEXT-GEN DW-PR (Navi

Mumbai Location 2) and also between NEXT-GEN

DW-PR (Navi Mumbai Location 2) and Hyderabad

(DR) or any other places for PR and DR decided by

the Bank.

We understand that there will be no requirement of Near

DR. Please confirm.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

362 Page - 36 Point 45 Database should be linearly scalable which can

expand the database capacity by just adding more

nodes to the existing database. If the data volume

grows more hardware can be added and expand

the database capacity

Is Bank expecting bidder to supply DB license, or will Bank

provide DB license under Bank EULA with OEM?

Please confirm.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

363 Page - 36 Point 48 The vendor should provide EXACT size needed for

production in the 1st year and estimated sizes for

consecutive years keeping in view the growth rate

predicted by Bank in this section and provide

empirical evidence for the calculation of growth

rate.

Is Bank expecting bidder to provide Hardware only for the 1st

Year, or is Bank expecting Hardware with

five-year sizing? Please confirm.

Solution should include hardware sizing for complete duration of the project

364 Annex-D Page 39 270+ TB Data with compression index of 2.5 Is Bank using any Archival software with Archival

Appliance to archive the Data? Please confirm.

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

365 Annex-G Page 46 Prod in TB Is Bank expecting bidder to provide storage capacity for the

1st Year, or does bidder have to provide the storage for the 5th

Year? Please confirm.

Bidder to propose the storage requirements from Year 1 to Year 5.

However, Bidders are expected to propose storage forecast over and above

given sizing in the solution to ensure fast performance of system.
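
A Year 1 to Year 5 storage proposal with a margin over the given sizing can be laid out as a simple compound-growth table. The sketch below is illustrative only: the Year 1 figure, the annual growth rate, and the headroom percentage are placeholder assumptions, not values from Annexure G, and `storage_forecast` is a hypothetical helper.

```python
def storage_forecast(year1_tb, annual_growth, headroom, years=5):
    """Return per-year storage (TB): compound growth on the base sizing plus a headroom margin."""
    forecast = []
    for year in range(1, years + 1):
        base = year1_tb * (1 + annual_growth) ** (year - 1)  # sizing before headroom
        forecast.append(round(base * (1 + headroom), 1))     # margin "over and above" the base
    return forecast


# Placeholder inputs: 100 TB in Year 1, 20% yearly growth, 25% headroom.
plan = storage_forecast(year1_tb=100.0, annual_growth=0.20, headroom=0.25)
for year, tb in enumerate(plan, start=1):
    print(f"Year {year}: {tb} TB")
```

A bidder would substitute the Annexure G sizing and the growth rate predicted by the Bank for the placeholder inputs.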

366 24 Critical Functional Requirements Authentication and Identity Management - A

comprehensive identity and access management

system should be available for centralized

management of users and groups. It should be

possible to quickly create and revoke the identity of

a user or a service by simply deleting or disabling

the account in the directory. Multi-factor

authentication is desired as an additional layer of

security for user sign-in and transactions.

Do we need to provide 2-factor authentication

solutions, or do we just need to integrate with existing SBI

2-factor solutions? Please clarify.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

367 Security Do we need to provide entire Security solutions or can

we leverage any Existing SBI Security devices?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

368 34 point 16 The Vendor shall ensure all Installations &

Implementation to be done by OEM badged

racks for hosting including all required cabling & all

other activities required for installation of

hardware

Please clarify whether Bidder is to provide just the

immediate connecting Networking Switches while SBI

provides the rest of the needed infrastructure.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

369 34 point 16 Installation and Configuration of Network

equipment

Please clarify, whether bidder to provide even

Networking Racks and passive cabling?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

370 35 The Hardware solution must be compatible to

integrate with various systems in the Bank

including but not limited to SOC, PIMS, NOC,

Command Centre, ITAM, Service Desk, ADS, and

SSO etc. at no extra cost. Vendor will have to give

appropriate support to the Bank during integration

with various components of IT environment.

Can the Bidder leverage the existing PIM / DLP / WAF /

Firewall / IPS / IDS / LB / DAM solutions deployed at SBI

for this engagement?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

371 13 Annexure B - Tech Rackspace, Power, Cooling, Network connectivity & Bidder to just provide the Replications and Internet

Bandwidth sizing while SBI will procure the same along

with necessary Routers

This information is not required at this stage of EOI.

372 34 All work related to patch panels will be done by

Vendor.

Bidder to just provide the number of Networking

ports required while SBI will procure the necessary

Networking switches

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

373 34 point 16 Installation and Configuration of Security

equipment

Do we need to provide Encryption/Decryption solution

other than Hadoop Native encryption?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

374 34 point 16 Installation and Configuration of Security

equipment

Do we need to provide HSM solutions if data at

rest or data in motion is in non-Hadoop?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

375 34 point 16 Installation and Configuration of Security

equipment

Do we need to provide the PKI based / Token based

Authentications?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

376 34 point 16 Installation and Configuration of Security

equipment

Do we need to propose DAM (Database Activity

Monitoring)?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

377 The hardware will be delivered in a staggered

manner and Vendor to provide a plan for the same

Please clarify this statement, as OEMs don't provide

commercials valid for longer durations.

Commercials are not required at this stage of EOI

Which Infra Monitoring Tools can we leverage?

Please clarify

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

379 33 The proposed hardware must not fall into ‘End of

Support’ for at least 7 years from the date of

delivery to the Bank.

What will be the total contract duration in years? This information is not required at this stage of EOI.

380 33 point number 12 The Vendor is required to supply, install, test,

commission, monitor, manage and maintain the IT

System along with operating system and other

peripherals with one-year warranty and AMC for 4

years from the date of delivery at data centers

advised by the Bank

How many years of post-implementation support are

needed? Please clarify

This information is not required at this stage of EOI.

381 The proposed hardware is mission critical for the

proposed project and support of 24 X 7 with an

uptime of 99.99 % to be ensured by providing

support at PR, and DR site for a period of 5 years.

Does SBI require a total of 5 years of support after Go-Live?

Please clarify

This information is not required at this stage of EOI.
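
The 99.99% uptime figure in the clause above corresponds to a small annual downtime budget. The following sketch makes the arithmetic explicit, assuming a 365-day year and uptime measured over all 24 x 7 hours; the function name is illustrative.

```python
def allowed_downtime_minutes(uptime_percent: float, days: int = 365) -> float:
    """Minutes of downtime permitted over `days` at the given uptime percentage."""
    total_minutes = days * 24 * 60
    return (1 - uptime_percent / 100) * total_minutes


yearly = allowed_downtime_minutes(99.99)
monthly = allowed_downtime_minutes(99.99, days=30)
print(f"{yearly:.2f} min/year, {monthly:.2f} min per 30 days")
```

Under these assumptions, 99.99% leaves about 52.56 minutes of downtime per year, or about 4.32 minutes in a 30-day month.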

382 13 Annexure B Technical Criteria/Scope

of Work

Detailed Migration Plan including timelines from

existing to new setup

The detailed Migration plan and timelines can only be

provided after completely understanding the existing

DWH solution.

Currently we are using IBM Stack. Bidders are expected to propose high level

migration plan

383 14 Data Ingestion, Point 8 Existing ETL Jobs to be Fine Tuned. What is the current ETL tool being used by the bank?

Does the Bank expect vendor to propose the same

tool?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

384 14 Data Ingestion, Point 8 Existing ETL Jobs to be Fine Tuned. Request the bank to elaborate more on this

statement. Does the Bank expect that the existing ETL

jobs keep running in the current DWH even after the

new DWH is functional?

Migrated ETL jobs will run on Next-Gen DW and are expected to be fine tuned

on new setup

385 14 Data Ingestion, Point 15 Vendor should list out all types of risks they expect

from the ingestion subsystem (e.g., dropping of

data packets during ingestion, security loopholes,

unprotected personally identifiable information,

etc.) along with mechanisms and processes they

would implement for mitigating such risks.

The detailed list of risks can be suggested only when

we have a detailed understanding of the Bank's current

systems.

Bidder to propose how best they can fulfill the requirement considering their

experience with Banking data.

386 15 Data Ingestion, Point 16 Proposed solution should be able to scrap

encrypted log, capture Metadata changes at source

level completely, scrapping 4000-5000 logs daily

having log size of ~ 2 TB each scalable up to 10000

logs. Proposed solution should be capable of

scrapping logs generated by any type of Database.

E.g. Oracle Database, IBM DB2 Database etc.

What is the mechanism of scrapping the logs in the

current DWH? Request bank to share the tool details.

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

387 15 Data Ingestion, Point 17 Solution should be able to handle DDL change

without manual reorg/runstat. It should handle

network fluctuations and hindrances.

We understand that if there is any structural change in

an object at the source which is being ingested in the

new system, then that change, if required, needs to be

percolated to the new system and will need manual

intervention. Please confirm our understanding.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

388 15 Data Ingestion, Point 21 Vendor should propose which technology is

suitable for each kind of upstream data ingestion

like Data Warehouse, Data Marts, Data Lake, Use

Data Virtualization/Federation layer, etc.

Request bank to elaborate on this requirement. Solution proposed by the Bidder should be capable of ingesting any kind of

data into appropriate components like Data Warehouse, Data Lake, Data

Marts or use of Data Virtualization depending on the requirement

389 16 Data Storage, Point 11 The storage system should be robust to handle at

least 1,50,000 concurrent queries (Select/DML) by

processing engines / ETL jobs / end users scalable

up to 6,00,000 concurrent queries in next 5 years

(assuming parallelism of 100 degree).

Request the bank to elaborate on the kind of queries

the end users will be making on the storage layer. Also

please elaborate on the different types of users that

will be accessing the storage layer.

This information is not required at this stage of EOI.
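
The clause's two figures (1,50,000 concurrent queries now, 6,00,000 in 5 years) imply a yearly growth rate if growth is assumed to compound smoothly. That assumption is not stated in the EOI and actual growth may be stepped. A minimal sketch of the calculation, with an illustrative function name:

```python
def implied_annual_growth(start: float, target: float, years: int) -> float:
    """Compound annual growth rate that takes `start` to `target` in `years` steps."""
    return (target / start) ** (1 / years) - 1


rate = implied_annual_growth(150_000, 600_000, 5)
print(f"Implied growth: {rate:.2%} per year")

# Year-by-year capacity targets under the same smooth-growth assumption.
targets = [round(150_000 * (1 + rate) ** year) for year in range(6)]
print(targets[0], targets[-1])
```

A fourfold increase over five years corresponds to roughly 32% growth per year, which gives a rough yardstick for staggered hardware additions.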

390 16 Data Storage, Point 12 Downstream departments (data consumer) to be

given separate processing power, storage to

undertake their requirements with separate DB

snapshot, Audit trails should be available for any

user accessing the Databases.

Please provide a list of all downstream applications

which will be consuming data from the data storage

layers.

This information is not required at this stage of EOI.

391 17 Data Processing Framework, Point 2 The NEXT-GEN DW ecosystem should have state of

the art data processing engines that can perform in-

memory processing to reduce the time for data

transformations and query in case of real time

requirements.

What are the source systems from which data will be

ingested in real time?

This information is not required at this stage of EOI.

392 17 Data Processing Framework, Point 2 The NEXT-GEN DW ecosystem should have state of

the art data processing engines that can perform in-

memory processing to reduce the time for data

transformations and query in case of real time

requirements.

What are the use cases for real time reporting? This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

393 17

Data Processing Framework, Point 6

Have a workflow management and scheduling

solution to schedule data transformation, data

acquisition or data delivery jobs.

Allocations of separate workload channel to

designated queries

Request bank to elaborate on this requirement. This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

394 17

Data Processing Framework, Point 12

ETL/ELT tool for data extraction should have AI/ML

features for suggesting / improving Query / ETL /

ELT Stages

Request bank to elaborate more on this requirement This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

395 18 Data Processing Framework, Point 18 Transformations for this activity can be categorized

into the following types:

migrated to NEXT-GEN DW

not sourced by DWH

real time data capture

Request Bank to provide the details of the existing

transformations that need to be migrated into the

NEXT-GEN DW.

This information is not required at this stage of EOI.

396 18 Migration from Existing Setup to

Proposed Solution, Point 1

Vendor should propose a detailed seamless

automated migration plan from existing setup to

proposed solution. Plan should focus on less

manual intervention, data reconciliation between

the systems and minimum parallel run of existing

and proposed solution.

The detailed Migration plan and timelines can only be

provided after completely understanding the existing

DWH solution.

Currently we are using IBM Stack. Bidders are expected to propose high level

migration plan.

397 19 Migration from Existing Setup to

Proposed Solution, Point 10

Vendor should list out all types of risks they expect

during the migration. Vendor should provide

justification if any downtime is required on existing

or proposed system during migration. Vendor

should provide all the pre-requisites for the

migration in the proposal.

The list of risks expected during data migration can be

listed only after understanding the existing DWH

solution.

Currently we are using IBM Stack. Bidders are expected to propose high level

migration plan.

398 19 Migration from Existing Setup to

Proposed Solution, Point 10

Vendor should list out all types of risks they expect

during the migration. Vendor should provide

justification if any downtime is required on existing

or proposed system during migration. Vendor

should provide all the pre-requisites for the

migration in the proposal.

The requirement of a downtime can be proposed after

understanding the complete ecosystem of the current

DWH solution.

Currently we are using IBM Stack. Bidders are expected to propose high level

migration plan.

399 19 Migration from Existing Setup to

Proposed Solution, Point 11

Vendor to review the existing architecture during

migration and remove duplication of data and

recommend improvements in overall setup if any

Does the bank intend to continue using the current

DWH solution even after the new solution is live?

Please elaborate.

Bank will take a final call at appropriate time.

400 19 Migration from Existing Setup to

Proposed Solution, Point 12

Vendor should provide a feasible plan for best use

of existing infrastructure which is procured during

last 10 years in staggered manner during the

implementation of Next-Gen DW which will save

cost to the Bank. (Annexure D gives the technology

architecture of the current setup)

Request bank to share the complete technical stack of

the current DWH in order for us to suggest the best

feasible plan for the existing infrastructure.

Currently we are using IBM Stack. Bidders are expected to propose high level

migration plan.

401 20 Cloud Integration and Migration,

Point 1

NEXT-GEN DW should be able to consume data

from external cloud-based infrastructures.

Request bank to share the complete information

about the external cloud based source systems from

which data needs to be consumed in the new system

This information is not required at this stage of EOI.

402 20 Cloud Integration and Migration,

Point 5

In view of the intent to reduce the hardware

footprint (in future), the technical architecture of

NEXT-GEN DW solution should be flexible to

accommodate provisioning of NEXT-GEN DW on

cloud. The Bank understands that there can be

differences in services offered by cloud service

providers. The NEXT-GEN DW solution architecture

should be designed considering as-is infrastructure

availability in cloud.

We understand that the proposed solution should be

compatible as IAAS on the cloud platform. Please

confirm our understanding.

Yes. At present, as per the Bank's IS policy migrating/storing data in public

cloud is not permitted. However bidders may propose, as an alternative, use

of cloud (public cloud, private cloud, on-premise etc) in addition to the best

integrated proposed solution.

403 20 Cloud Integration and Migration,

Point 6

Adherence to global standards related to cloud Request bank to share the global standards mentioned

in this requirement.

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

404 21 Monitoring Dashboard, Point 16 Back Dated Data changes need to be updated on

portal

Which portal is being referred to in this statement? Portal refers to Monitoring Dashboard

405 22 Data Governance, Point 12 Metadata Management Capability: Tool should

cater to three broad categories of metadata;

Business metadata, Technical metadata and

Operational metadata

Does the current DWH solution have a Metadata

Management Solution? If yes, please share the tool

used .

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

406 22 Data Governance, Point 12 Metadata Management Capability: Tool should

cater to three broad categories of metadata;

Business metadata, Technical metadata and

Operational metadata

If there is an existing Metadata Management Solution,

do we need to migrate the existing metadata to the

new solution.

Yes

407 22 Data Governance, Point 13 Masterdata Management Capability: Master Data

Management tool (s) should deliver consolidated,

complete and accurate view of business-critical

master information to all the operational and

analytical systems across the Bank.

Does the current DWH solution have a MDM Solution?

If yes, please share the tool used .

No, DWH doesn’t have MDM solution

408 22 Data Governance, Point 13 Masterdata Management Capability: Master Data

Management tool (s) should deliver consolidated,

complete and accurate view of business-critical

master information to all the operational and

analytical systems across the Bank.

Request bank to elaborate more on the MDM solution

that is required.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

409 23 Data Quality, Point 12 Mechanism to capture feedback from end users to

report Data Quality issues

Request bank to elaborate more on this requirement Refer to Sr.No. #318

410 27 Data Encryption, Point 5 The overall SLA for data processing should be

adhered to, keeping data encryption as an

important activity.

What is the SLA for a data processing job? Please

provide details of all the SLAs that will be applicable

on the new system.

This information is not required at this stage of EOI.

411 27 Data Encryption, Point 6 Proposed Solution should be capable of ingesting

encrypted data from source system. It should

support the encryption/decryption mechanism

implemented at source system.

Request bank to share the encryption/decryption

mechanism implemented at source system.

This information is not required at this stage of EOI.

412 28 Downstream Data Consumption Self-service portal to extract the data on their own

(Should support Data Democratization)

Request bank to elaborate more on this requirement.

Does the vendor need to propose a portal solution for

this requirement?

Yes, Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

413 30 Business Intelligence Tools, Point 5 Visualizations: BI tools must provide below

different types of visualizations;

Please provide the use case for visualizations having

animations.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

414 30 Business Intelligence Tools, Point 6 Reporting on all types of available Data Formats; Please provide the use case for reporting on

Multimedia data.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

415 32 Business Intelligence Tools, Point 26 Ease of analytical use: There should be different

criteria defined for each type of user, such as

information consumer, business analyst and IT.

What are the different types of users that will be

accessing the Business Intelligence Solution?

Refer to Hardware Specifications section starting on page # 32

416 32 Business Intelligence Tools, Point 26 Ease of analytical use: There should be different

criteria defined for each type of user, such as

information consumer, business analyst and IT.

Please provide count of such BI Power Users and BI

Recipient users.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

417 32 Business Intelligence Tools, Point 28 The best practice is to create several prebuilt query

scenarios and compare how each product performs

based on these specific examples. The worse

practice is to just arbitrarily rate the speed.

Request bank to elaborate more on this requirement

with an example.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

418 32 Business Intelligence Tools, Point 28 The best practice is to create several prebuilt query

scenarios and compare how each product performs

based on these specific examples. The worse

practice is to just arbitrarily rate the speed.

What is the peak user concurrency?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

419 32 Business Intelligence Tools, Point 28 The best practice is to create several prebuilt query

scenarios and compare how each product performs

based on these specific examples. The worse

practice is to just arbitrarily rate the speed.

What is the expected user growth year on year for the

next 5 years?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

420 32 Business Intelligence Tools, Point 31 There should be separate criteria for BI user online

help versus technical documentation.

We understand that a user guide needs to be provided

for the business users. Please confirm our

understanding.

Yes

421 32 Business Intelligence Tools, Point 31 Ability to handle and summarize huge volumes of

data. E.g. 30-40 million rows accessed on index and

summarized over 5 to 8 metrics.

Please share the detailed list of SLAs applicable for the

new system.

This information is not required at this stage of EOI.

422 38 Annexure C - Monthly Data

processed in DWH Warehouse

Flat file extract for SBI INVM

Daily File =>151 (Total Files) * 0.9 (GB Avg File Size)

* 31 (days) = 4212.9 (GB) Monthly File =>

2TB(Approx)

We assume that there are a total of 150 daily files and

1 monthly file that needs to be ingested into the new

system. Please confirm.

Please refer to Data Ingestion point # 20, page 15
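The daily-file figure quoted from Annexure C in the row above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function and variable names are assumptions, and the inputs are simply the numbers quoted in the clause (151 files, 0.9 GB average size, 31 days).

```python
# Inputs quoted in the Annexure C clause; all names here are illustrative.
TOTAL_DAILY_FILES = 151   # "151 (Total Files)"
AVG_FILE_SIZE_GB = 0.9    # "0.9 (GB Avg File Size)"
DAYS_PER_MONTH = 31       # "31 (days)"


def monthly_volume_gb(files_per_day: int, avg_size_gb: float, days: int) -> float:
    """Return the volume ingested over one month, in GB."""
    return files_per_day * avg_size_gb * days


invm_daily_total_gb = monthly_volume_gb(
    TOTAL_DAILY_FILES, AVG_FILE_SIZE_GB, DAYS_PER_MONTH
)
print(f"Daily files over one month: {invm_daily_total_gb:.1f} GB")
```

This reproduces the 4212.9 GB stated in the clause. The separate monthly file of approximately 2 TB is not included in that product.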

423 38 Annexure C - Monthly Data

processed in DWH Warehouse

Flat file extract for SBI INVM

Daily File =>151 (Total Files) * 0.9 (GB Avg File Size)

* 31 (days) = 4212.9 (GB) Monthly File =>

2TB(Approx)

What is the normal refresh time for daily batch?

Please provide details about average start time and

end time for various stages of ETL, i.e. Source to

Staging, Staging to EDW, Aggregations, etc.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

424 38 Annexure C - Monthly Data

processed in DWH Warehouse

Flat file extract from ATM, INB, PSG, and SBI Card

1.PSG Weekly=>102 MB per week*4=408 MB (Avg

File Size)

2.SBI Card Weekly=>1 GB per week*4=4 GB (Avg

File Size)

3. [~22 GB per day*31=~682 GB]load for file

ATM,INB base]=1147

We assume that there are a total of 4 file extracts

from the PSG system that needs to be ingested weekly

into the new warehouse. Please confirm

Please refer to Data Ingestion point # 20, page 15
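The three per-source products quoted from Annexure C in the row above can be recomputed as follows. This is a minimal sketch under stated assumptions: the names are illustrative, and the conversion of 1000 MB per GB is an assumption, since the EOI does not say which unit convention it uses. The clause also ends with a figure of 1147 whose derivation is not given in the text, so the sketch does not attempt to reproduce it.

```python
# Figures quoted in the Annexure C clause; all names here are illustrative.
MB_PER_GB = 1000       # assumption: decimal units (not stated in the EOI)
WEEKS_PER_MONTH = 4    # the clause multiplies weekly volumes by 4
DAYS_PER_MONTH = 31    # the clause multiplies daily volumes by 31

psg_monthly_mb = 102 * WEEKS_PER_MONTH      # PSG: 102 MB per week
sbi_card_monthly_gb = 1 * WEEKS_PER_MONTH   # SBI Card: 1 GB per week
atm_inb_monthly_gb = 22 * DAYS_PER_MONTH    # ATM/INB: ~22 GB per day

# Sum of the three quoted components, expressed in GB.
subtotal_gb = psg_monthly_mb / MB_PER_GB + sbi_card_monthly_gb + atm_inb_monthly_gb

print(f"PSG: {psg_monthly_mb} MB per month")
print(f"SBI Card: {sbi_card_monthly_gb} GB per month")
print(f"ATM/INB: {atm_inb_monthly_gb} GB per month")
print(f"Sum of the three components: {subtotal_gb:.3f} GB")
```

The first three values match the 408 MB, 4 GB and ~682 GB given in the clause.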

425 38 Annexure C - Monthly Data

processed in DWH Warehouse

Flat file extract from ATM, INB, PSG, and SBI Card

1.PSG Weekly=>102 MB per week*4=408 MB (Avg

File Size)

2.SBI Card Weekly=>1 GB per week*4=4 GB (Avg

File Size)

3. [~22 GB per day*31=~682 GB]load for file

ATM,INB base]=1147

We assume that there are a total of 4 file extracts

from the SBI Card system that needs to be ingested

weekly into the new warehouse. Please confirm

Please refer to Data Ingestion point # 20, page 15

426 38 Annexure C - Monthly Data

processed in DWH Warehouse

Flat file extract from ATM, INB, PSG, and SBI Card

1.PSG Weekly=>102 MB per week*4=408 MB (Avg

File Size)

2.SBI Card Weekly=>1 GB per week*4=4 GB (Avg

File Size)

3. [~22 GB per day*31=~682 GB]load for file

ATM,INB base]=1147

Please provide clarity on how the data will be

provided from ATM and INB systems.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in Data Ingestion, point # 6 and point # 20 in the

EOI.

427 38 Annexure C - Monthly Data

processed in DWH Warehouse

Contribution from other source systems like DMAT,

CMP, SBI Life, LOS, etc.

Monthly 200 GB(Approx)

How will the data be provided from the mentioned

source systems?

A) Please elaborate on the number of files/tables to

be ingested.

B) What will be the frequency by which the data will

be provided?

C) What is the format of data feeds coming from

various source systems? Is it delimited files Push or

RDBMS data read by ETL tool or any other method?

Please provide details.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in Data Ingestion, point # 6 and point # 20 in the

EOI.

428 39 Annexure D - Existing Data

Warehouse Architecture

Database appliance

500+ TB of data with compression index of 2.5

We understand that the total compressed data in the

data warehouse is 500TB. What is the maximum

storage capacity of the current data warehouse

infrastructure?

This information is not required at this stage of EOI.

429 39 Annexure D - Existing Data

Warehouse Architecture

Data Sourcing and Extraction Jobs

17000 Data sourcing and data extraction jobs

What is the breakup of jobs

complexity(Simple/Medium/Complex) wise?

For the purpose of this EOI consider all jobs as complex in nature

430 39 Annexure D - Existing Data

Warehouse Architecture

Reporting

300+ Interactive reports

100+ Busted reports

We understand that this provides the number of

reports being generated from the current DWH and

needs to be migrated to the new system. Please

confirm.

These are indicative numbers given for reference. Actual numbers may

change in future.

431 39 Annexure D - Existing Data

Warehouse Architecture

Reporting

300+ Interactive reports

100+ Busted reports

What is the breakup of reports

complexity(Simple/Medium/Complex) wise?

For the purpose of this EOI consider all reports as complex in nature

432 39 Annexure D - Existing Data

Warehouse Architecture

Reporting

300+ Interactive reports

100+ Busted reports

Are there any new reports or dashboards that need to

be created in the new system? Please provide the

number of reports and dashboards along with the

complexity (Simple/Medium/Complex) breakup.

Yes, new reports and dashboards will be required to be created in new

system as and when required in future. Further details are not required at

this point of time. Bidders are free to propose solution(s) in the best interest

of the Bank to meet the requirements given in the EOI.

433 39 Annexure D - Existing Data

Warehouse Architecture

Reporting

300+ Interactive reports

100+ Busted reports

How many years of data has to be there in the

Interactive reports?

Interactive reports might be run on complete data set depending on future

requirements

434 39 Annexure D - Existing Data

Warehouse Architecture

Reporting

300+ Interactive reports

100+ Busted reports

Is there a requirement for Dashboards? If yes, please

provide the

Please refer to Business Intelligence Tools on page # 30

435 25 Regulatory Reporting, Point 1 Vendor should follow the RBI guideline in

developing the solution with which it will be easier

for the Bank to migrate to the element-based data

reporting envisaged by the RBI.

Is it the requirement of SDMX format reporting?

This information is not required at this stage of EOI.

436 25 Regulatory Reporting, Point 5 Data Lineage and Transparency – Tool should

retrace the journey of the source data through

every single workflow processes or calculations

across siloes systems all the way to disclosures.

Request bank to elaborate more on this requirement.

This is an industry standard terminology

437 26 Regulatory Reporting, Point 11 Pre-submission Review – Multiple report writers

should allow users to review reports in various

formats before submission, with the ability to drill

down and make manual adjustments where

necessary.

What type of system is expected for review?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

438 26 Regulatory Reporting, Point 12 Tool must support generation of reports in XBRL

format

What is the current mechanism being used to

generate XBRL format?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

439 22 Data Governance, Point 7 Automated propagation of changes to NEXT-GEN

DW Data Dictionary and business glossary by

multiple sources as and when changes occur in

source.

What level of detail is expected in the Data Dictionary and

business glossary?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

440 14 Data Ingestion, point 7 Data sanity checks, automated reject processing,

validations and reconciliation of data should be

available as part of data ingestion solution to

ensure the integrity of data.

Automated reject processing is feasible for known

issues; however, unforeseen ones need to follow a testing

and validation process.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

441 21 Data Governance Provide traceability – it should be possible to track

and visualize any data transformation or any rule

applied to data in the source system -> Next Gen

DWH -> Downstream systems

No query asked.

442 24 Security and Compliance Authentication and Identity Management -

TCS understands that the Bank is looking for an IAM solution for user life

cycle management. Can the vendor consider a new IAM

tool deployment for Next-Gen DW?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

443 25 Security and Compliance Data protection - It should be possible to protect

the data in the NEXT-GEN DW throughout its

lifecycle including data at rest and data in motion.

Data Leakage - Security CIA parameters should be

achieved, and tools should be able to find and alert

on Data leakage

Has data classification been done for Next-Gen DW? Is the Bank

looking for deployment of a DLP tool?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

444 25 Security and Compliance Auditing - Audit or diagnostic logs should be used

to log management-related activities or data-

related activities. Log management and auditing of

all critical activities on NEXT-GEN DW is a critical

requirement. The Bank reserves right to ask the

vendor to produce / analyze logs for reporting

purposes.

TCS understands that the Bank is looking for a DAM solution. Is this

understanding correct?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

445 13 Annexure B Technical Criteria/Scope

of Work

Detailed Migration Plan including timelines from

existing to new setup

The detailed Migration plan and timelines can only be

provided after completely understanding the existing

DWH solution.

Duplicate Query. Refer to Sr.No. #396

446 14 Data Ingestion, Point 8 Existing ETL Jobs to be Fine Tuned. What is the current ETL tool being used by the bank?

Does the Bank expect vendor to propose the same

tool?

Duplicate Query. Refer to Sr.No. #383

447 14 Data Ingestion, Point 8 Existing ETL Jobs to be Fine Tuned. Request the bank to elaborate more on this

statement. Does the Bank expect that the existing ETL

jobs keep running in the current DWH even after the

new DWH is functional?

Duplicate Query. Refer to Sr.No. #384

448 14 Data Ingestion, Point 15 Vendor should list out all types of risks they expect

from the ingestion subsystem (e.g., dropping of

data packets during ingestion, security loopholes,

unprotected personally identifiable information,

etc.) along with mechanisms and processes they

would implement for mitigating such risks.

The detailed list of risks can be suggested only when

we have a detailed understanding of the bank's current

systems.

Duplicate Query. Refer to Sr.No. #385

449 15 Data Ingestion, Point 16 Proposed solution should be able to scrap

encrypted log, capture Metadata changes at source

level completely, scrapping 4000-5000 logs daily

having log size of ~ 2 TB each scalable up to 10000

logs. Proposed solution should be capable of

scrapping logs generated by any type of Database.

E.g. Oracle Database, IBM DB2 Database etc.

What is the mechanism of scrapping the logs in the

current DWH? Request bank to share the tool details.

Duplicate Query. Refer to Sr.No. #386

450 15 Data Ingestion, Point 17 Solution should be able to handle DDL change

without manual reorg/runstat. It should handle

network fluctuations and hindrances.

We understand that if there is any structural change in

an object at the source which is being ingested in the

new system, then that change if required needs to be

percolated to the new system and will need a manual

intervention. Please confirm our understanding.

Duplicate Query. Refer to Sr.No. #387

451 15 Data Ingestion, Point 21 Vendor should propose which technology is

suitable for each kind of upstream data ingestion

like Data Warehouse, Data Marts, Data Lake, Use

Data Virtualization/Federation layer, etc.

Request bank to elaborate on this requirement.

Duplicate Query. Refer to Sr.No. #388

452 16 Data Storage, Point 11 The storage system should be robust to handle at

least 1,50,000 concurrent queries (Select/DML) by

processing engines / ETL jobs / end users scalable

up to 6,00,000 concurrent queries in next 5 years

(assuming parallelism of 100 degree).

Request the bank to elaborate on the kind of queries

the end users will be making on the storage layer. Also

please elaborate on the different types of users that

will be accessing the storage layer.

Duplicate Query. Refer to Sr.No. #389
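The scaling target quoted in the Data Storage clause above (1,50,000 concurrent queries growing to 6,00,000 in 5 years, written in Indian digit grouping) implies a growth multiple that is easy to work out. The sketch below is illustrative only: the names are assumptions, and the steady compound-growth model is an assumption made for the example, since the EOI states only the start and end figures.

```python
# Figures quoted in the Data Storage clause; all names here are illustrative.
CURRENT_CONCURRENT_QUERIES = 150_000  # "1,50,000" in Indian digit grouping
TARGET_CONCURRENT_QUERIES = 600_000   # "6,00,000" in Indian digit grouping
YEARS = 5

growth_multiple = TARGET_CONCURRENT_QUERIES / CURRENT_CONCURRENT_QUERIES

# Assumption: growth compounds at a constant annual rate over the period.
implied_annual_growth = growth_multiple ** (1 / YEARS) - 1

print(f"Growth multiple over {YEARS} years: {growth_multiple:.0f}x")
print(f"Implied compound annual growth: {implied_annual_growth:.1%}")
```

Under that assumption the target corresponds to a fourfold increase, or roughly 32% compound growth per year.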

453 16 Data Storage, Point 12 Downstream departments (data consumer) to be

given separate processing power, storage to

undertake their requirements with separate DB

snapshot, Audit trails should be available for any

user accessing the Databases.

Please provide a list of all downstream applications

which will be consuming data from the data storage

layers.

Duplicate Query. Refer to Sr.No. #390

454 17 Data Processing Framework, Point 2 The NEXT-GEN DW ecosystem should have state of

the art data processing engines that can perform in-

memory processing to reduce the time for data

transformations and query in case of real time

requirements.

What are the source systems from which data will be

ingested in real time?

Duplicate Query. Refer to Sr.No. #391

455 17 Data Processing Framework, Point 2 The NEXT-GEN DW ecosystem should have state of

the art data processing engines that can perform in-

memory processing to reduce the time for data

transformations and query in case of real time

requirements.

What are the use cases for real time reporting?

Duplicate Query. Refer to Sr.No. #392

456 17

Data Processing Framework, Point 6

Have a workflow management and scheduling

solution to schedule data transformation, data

acquisition or data delivery jobs.

Allocations of separate workload channel to

designated queries

Request bank to elaborate on this requirement.

Duplicate Query. Refer to Sr.No. #393

457 17

Data Processing Framework, Point 12

ETL/ELT tool for data extraction should be AI/ML

features for suggesting / improving Query / ETL /

ELT Stages

Request bank to elaborate more on this requirement.

Duplicate Query. Refer to Sr.No. #394

458 18 Data Processing Framework, Point 18 Transformations for this activity can be categorized

into the following types:

migrated to NEXT-GEN DW

not sourced by DWH

real time data capture

Request Bank to provide the details of the existing

transformations that needs to be migrated into the

NEXT-GEN DW.

Duplicate Query. Refer to Sr.No. #395

459 18 Migration from Existing Setup to

Proposed Solution, Point 1

Vendor should propose a detailed seamless

automated migration plan from existing setup to

proposed solution. Plan should focus on less

manual intervention, data reconciliation between

the systems and minimum parallel run of existing

and proposed solution.

The detailed Migration plan and timelines can only be

provided after completely understanding the existing

DWH solution.

Duplicate Query. Refer to Sr.No. #396

460 19 Migration from Existing Setup to

Proposed Solution, Point 10

Vendor should list out all types of risks they expect

during the migration. Vendor should provide

justification if any downtime is required on existing

or proposed system during migration. Vendor

should provide all the pre-requisites for the

migration in the proposal.

The list of risks expected during data migration can be

listed only after understanding the existing DWH

solution.

Duplicate Query. Refer to Sr.No. #397

461 19 Migration from Existing Setup to

Proposed Solution, Point 10

Vendor should list out all types of risks they expect

during the migration. Vendor should provide

justification if any downtime is required on existing

or proposed system during migration. Vendor

should provide all the pre-requisites for the

migration in the proposal.

The requirement of a downtime can be proposed after

understanding the complete ecosystem of the current

DWH solution.

Duplicate Query. Refer to Sr.No. #398

462 19 Migration from Existing Setup to

Proposed Solution, Point 11

Vendor to review the existing architecture during

migration and remove duplication of data and

recommend improvements in overall setup if any

Does the bank intend to continue using the current

DWH solution even after the new solution is live?

Please elaborate.

Duplicate Query. Refer to Sr.No. #399

463 19 Migration from Existing Setup to

Proposed Solution, Point 12

Vendor should provide a feasible plan for best use

of existing infrastructure which is procured during

last 10 years in staggered manner during the

implementation of Next-Gen DW which will save

cost to the Bank. (Annexure D gives the technology

architecture of the current setup)

Request bank to share the complete technical stack of

the current DWH in order for us to suggest the best

feasible plan for the existing infrastructure.

Duplicate Query. Refer to Sr.No. #400

464 20 Cloud Integration and Migration,

Point 1

NEXT-GEN DW should be able to consume data

from external cloud-based infrastructures.

Request bank to share the complete information

about the external cloud based source systems from

which data needs to be consumed in the new system

Duplicate Query. Refer to Sr.No. #401

465 20 Cloud Integration and Migration,

Point 5

In view of the intent to reduce the hardware

footprint (in future), the technical architecture of

NEXT-GEN DW solution should be flexible to

accommodate provisioning of NEXT-GEN DW on

cloud. The Bank understands that there can be

differences in services offered by cloud service

providers. The NEXT-GEN DW solution architecture

should be designed considering as-is infrastructure

availability in cloud.

We understand that the proposed solution should be

compatible as IAAS on the cloud platform. Please

confirm our understanding.

Duplicate Query. Refer to Sr.No. #402

466 20 Cloud Integration and Migration,

Point 6

Adherence to global standards related to cloud Request bank to share the global standards mentioned

in this requirement.

Duplicate Query. Refer to Sr.No. #403

467 21 Monitoring Dashboard, Point 16 Back Dated Data changes needs to be updated on

portal

Which portal is being referred to in this statement?

Duplicate Query. Refer to Sr.No. #404

468 22 Data Governance, Point 12 Metadata Management Capability: Tool should

cater to three broad categories of metadata;

Business metadata, Technical metadata and

Operational metadata

Does the current DWH solution have a Metadata

Management Solution? If yes, please share the tool

used.

Duplicate Query. Refer to Sr.No. #405

469 22 Data Governance, Point 12 Metadata Management Capability: Tool should

cater to three broad categories of metadata;

Business metadata, Technical metadata and

Operational metadata

If there is an existing Metadata Management Solution,

do we need to migrate the existing metadata to the

new solution?

Duplicate Query. Refer to Sr.No. #406

470 22 Data Governance, Point 13 Masterdata Management Capability: Master Data

Management tool (s) should deliver consolidated,

complete and accurate view of business-critical

master information to all the operational and

analytical systems across the Bank.

Does the current DWH solution have a MDM Solution?

If yes, please share the tool used.

Duplicate Query. Refer to Sr.No. #407

471 22 Data Governance, Point 13 Masterdata Management Capability: Master Data

Management tool (s) should deliver consolidated,

complete and accurate view of business-critical

master information to all the operational and

analytical systems across the Bank.

Request bank to elaborate more on the MDM solution

that is required.

Duplicate Query. Refer to Sr.No. #408

472 23 Data Quality, Point 12 Mechanism to capture feedback from end users to

report Data Quality issues

Request bank to elaborate more on this requirement.

Refer to Sr.No. #318

473 27 Data Encryption, Point 5 The overall SLA for data processing should be

adhered to, keeping data encryption as an

important activity.

What is the SLA for a data processing job? Please

provide details of all the SLA's that will be applicable

on the new system.

Duplicate Query. Refer to Sr.No. #410

474 27 Data Encryption, Point 6 Proposed Solution should be capable on ingesting

encrypted data from source system. It should

support the encryption/decryption mechanism

implemented at source system.

Request bank to share the encryption/decryption

mechanism implemented at source system.

Duplicate Query. Refer to Sr.No. #411

475 28 Downstream Data Consumption Self-service portal to extract the data on their own

(Should support Data Democratization)

Request bank to elaborate more on this requirement.

Does the vendor need to propose a portal solution for

this requirement?

Duplicate Query. Refer to Sr.No. #412

476 30 Business Intelligence Tools, Point 5 Visualizations: BI tools must provide below

different types of visualizations;

Please provide the use case for visualizations having

animations.

Duplicate Query. Refer to Sr.No. #413

477 30 Business Intelligence Tools, Point 6 Reporting on all types of available of Data Formats; Please provide the use case for reporting on

Multimedia data.

Duplicate Query. Refer to Sr.No. #414

478 32 Business Intelligence Tools, Point 26 Ease of analytical use: There should be different

criteria defined for each type of user, such as

information consumer, business analyst and IT.

What are the different types of users that will be

accessing the Business Intelligence Solution?

Duplicate Query. Refer to Sr.No. #415

479 32 Business Intelligence Tools, Point 26 Ease of analytical use: There should be different

criteria defined for each type of user, such as

information consumer, business analyst and IT.

Please provide count of such BI Power Users and BI

Recipient users.

Duplicate Query. Refer to Sr.No. #416

480 32 Business Intelligence Tools, Point 28 The best practice is to create several prebuilt query

scenarios and compare how each product performs

based on these specific examples. The worse

practice is to just arbitrarily rate the speed.

Request bank to elaborate more on this requirement

with an example.

Duplicate Query. Refer to Sr.No. #417

481 32 Business Intelligence Tools, Point 28 The best practice is to create several prebuilt query

scenarios and compare how each product performs

based on these specific examples. The worse

practice is to just arbitrarily rate the speed.

What is the peak user concurrency?

Duplicate Query. Refer to Sr.No. #418

482 32 Business Intelligence Tools, Point 28 The best practice is to create several prebuilt query

scenarios and compare how each product performs

based on these specific examples. The worse

practice is to just arbitrarily rate the speed.

What is the expected user growth year on year for the

next 5 years?

Duplicate Query. Refer to Sr.No. #419

483 32 Business Intelligence Tools, Point 31 There should be separate criteria for BI user online

help versus technical documentation.

We understand that a user guide needs to be provided

for the business users. Please confirm our

understanding.

Duplicate Query. Refer to Sr.No. #420

484 32 Business Intelligence Tools, Point 31 Ability to handle and summarize huge volumes of

data. E.g. 30-40 million rows accessed on index and

summarized over 5 to 8 metrics.

Please share the detailed list of SLAs applicable for the

new system.

Duplicate Query. Refer to Sr.No. #421

485 38 Annexure C - Monthly Data

processed in DWH Warehouse

Flat file extract for SBI INVM

Daily File =>151 (Total Files) * 0.9 (GB Avg File Size)

* 31 (days) = 4212.9 (GB) Monthly File =>

2TB(Approx)

We assume that there are a total of 150 daily files and

1 monthly file that needs to be ingested into the new

system. Please confirm.

Duplicate Query. Refer to Sr.No. #422

486 38 Annexure C - Monthly Data

processed in DWH Warehouse

Flat file extract for SBI INVM

Daily File =>151 (Total Files) * 0.9 (GB Avg File Size)

* 31 (days) = 4212.9 (GB) Monthly File =>

2TB(Approx)

What is the normal refresh time for daily batch?

Please provide details about average start time and

end time for various stages of ETL, i.e. Source to

Staging, Staging to EDW, Aggregations, etc.

Duplicate Query. Refer to Sr.No. #423

487 38 Annexure C - Monthly Data

processed in DWH Warehouse

Flat file extract from ATM, INB, PSG, and SBI Card

1.PSG Weekly=>102 MB per week*4=408 MB (Avg

File Size)

2.SBI Card Weekly=>1 GB per week*4=4 GB (Avg

File Size)

3. [~22 GB per day*31=~682 GB]load for file

ATM,INB base]=1147

We assume that there are a total of 4 file extracts

from the PSG system that needs to be ingested weekly

into the new warehouse. Please confirm

Duplicate Query. Refer to Sr.No. #424

488 38 Annexure C - Monthly Data

processed in DWH Warehouse

Flat file extract from ATM, INB, PSG, and SBI Card

1.PSG Weekly=>102 MB per week*4=408 MB (Avg

File Size)

2.SBI Card Weekly=>1 GB per week*4=4 GB (Avg

File Size)

3. [~22 GB per day*31=~682 GB]load for file

ATM,INB base]=1147

We assume that there are a total of 4 file extracts

from the SBI Card system that needs to be ingested

weekly into the new warehouse. Please confirm

Duplicate Query. Refer to Sr.No. #425

489 38 Annexure C - Monthly Data

processed in DWH Warehouse

Flat file extract from ATM, INB, PSG, and SBI Card

1.PSG Weekly=>102 MB per week*4=408 MB (Avg

File Size)

2.SBI Card Weekly=>1 GB per week*4=4 GB (Avg

File Size)

3. [~22 GB per day*31=~682 GB]load for file

ATM,INB base]=1147

Please provide clarity on how the data will be

provided from ATM and INB systems.

Duplicate Query. Refer to Sr.No. #426

490 38 Annexure C - Monthly Data

processed in DWH Warehouse

Contribution from other source systems like DMAT,

CMP, SBI Life, LOS, etc.

Monthly 200 GB(Approx)

How will the data be provided from the mentioned

source systems?

A) Please elaborate on the number of files/tables to

be ingested.

B) What will be the frequency by which the data will

be provided?

C) What is the format of data feeds coming from

various source systems? Is it delimited files Push or

RDBMS data read by ETL tool or any other method?

Please provide details.

Duplicate Query. Refer to Sr.No. #427

491 39 Annexure D - Existing Data

Warehouse Architecture

Database appliance

500+ TB of data with compression index of 2.5

We understand that the total compressed data in the

data warehouse is 500TB. What is the maximum

storage capacity of the current data warehouse

infrastructure?

Duplicate Query. Refer to Sr.No. #428

492 39 Annexure D - Existing Data

Warehouse Architecture

Data Sourcing and Extraction Jobs

17000 Data sourcing and data extraction jobs

What is the breakup of jobs

complexity(Simple/Medium/Complex) wise?

Duplicate Query. Refer to Sr.No. #429

493 39 Annexure D - Existing Data

Warehouse Architecture

Reporting

300+ Interactive reports

100+ Busted reports

We understand that this provides the number of

reports being generated from the current DWH and

needs to be migrated to the new system. Please

confirm.

Duplicate Query. Refer to Sr.No. #430

494 39 Annexure D - Existing Data

Warehouse Architecture

Reporting

300+ Interactive reports

100+ Busted reports

What is the breakup of reports

complexity(Simple/Medium/Complex) wise?

Duplicate Query. Refer to Sr.No. #431

495 39 Annexure D - Existing Data

Warehouse Architecture

Reporting

300+ Interactive reports

100+ Busted reports

Are there any new reports or dashboards that need to

be created in the new system? Please provide the

number of reports and dashboards along with the

complexity (Simple/Medium/Complex) breakup.

Duplicate Query. Refer to Sr.No. #432

496 39 Annexure D - Existing Data

Warehouse Architecture

Reporting

300+ Interactive reports

100+ Busted reports

How many years of data has to be there in the

Interactive reports?

Duplicate Query. Refer to Sr.No. #433

497 39 Annexure D - Existing Data

Warehouse Architecture

Reporting

300+ Interactive reports

100+ Busted reports

Is there a requirement for Dashboards? If yes, please

provide the

Duplicate Query. Refer to Sr.No. #434

498 25 Regulatory Reporting, Point 1 Vendor should follow the RBI guideline in

developing the solution with which it will be easier

for the Bank to migrate to the element-based data

reporting envisaged by the RBI.

Is it the requirement of SDMX format reporting? Duplicate Query. Refer to Sr.No. #435

499 25 Regulatory Reporting, Point 5 Data Lineage and Transparency – Tool should

retrace the journey of the source data through

every single workflow processes or calculations

across siloed systems all the way to disclosures.

Request bank to elaborate more on this requirement. Duplicate Query. Refer to Sr.No. #436

500 26 Regulatory Reporting, Point 11 Pre-submission Review – Multiple report writers

should allow users to review reports in various

formats before submission, with the ability to drill

down and make manual adjustments where

necessary.

What type of system is expected for review? Duplicate Query. Refer to Sr.No. #437

501 26 Regulatory Reporting, Point 12 Tool must support generation of reports in XBRL

format

What is the current mechanism being used to

generate XBRL format?

Duplicate Query. Refer to Sr.No. #438

502 22 Data Governance, Point 7 Automated propagation of changes to NEXT-GEN

DW Data Dictionary and business glossary by

multiple sources as and when changes occur in

source.

What level of detail is expected in the Data Dictionary and

business glossary?

Duplicate Query. Refer to Sr.No. #439

503 14 Data Ingestion, point 7 Data sanity checks, automated reject processing,

validations and reconciliation of data should be

available as part of data ingestion solution to

ensure the integrity of data.

Automated reject processing is feasible for known

issues; however, unforeseen ones need to follow the testing

and validation process.

Duplicate Query. Refer to Sr.No. #440

504 30 Business Intelligence Tools, Point 6 Reporting on all types of available of Data Formats; Please provide the details of the sources from which

unstructured and semi-structured data will be

ingested.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

505 38 Annexure C - Monthly Data

processed in DWH Warehouse

Source System Details We understand that this Annexure provides the details

of the sources which are being ingested in the current

data warehouse and the same needs to be integrated

into the new system. Please confirm our

understanding.

Please refer to Annexure C for more details

506 38 Annexure C - Monthly Data

processed in DWH Warehouse

Source System Details Are there any new source systems that will need to be

ingested into the new system? For each of the source

system please provide the following details

a) Please provide the number of files/tables to be

ingested from the new source systems.

b) Will the data be ingested in real time or batch mode?

c) What will be the frequency by which the data will

be provided?

d) What is the format of data feeds coming from

various source systems? Is it delimited files Push or

RDBMS data read by ETL tool or any other method?

e) What will be the volume of data that will be

ingested daily?

f) What will be the data volume growth year on year

for the next 5 years?

g) What is the current volume of data in the source

system?

Please refer to Annexure C for more details

507 13 Data Ingestion - Point 1 Capable of ingesting data from any source system

in automated manner currently implemented in the

Bank, or any future standard source systems that

the Bank will decide to use with high throughput

and low latency. Vendor to propose performance

benchmarking for the same.

Please elaborate more "Vendor to propose

performance benchmarking for the same."

Bidder to provide details of performance benchmarking to enable us to take a

holistic and comprehensive view of the architecture in formulating next

course of action

508 14 Data Ingestion - Point 3 GUI based framework to configure sources to NEXT-

GEN DW

1. Does the existing system have a GUI framework that can be

used for base understanding or reuse?

2. Or do we need to build an altogether new GUI

framework?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

509 14 Data Ingestion - Point 4 Ingestion subsystem should allow to configure

ingestion processes from single / multiple source

system, single / multiple files, single / multiple

operational input files

Should the ingestion subsystem also be configurable

through the GUI framework?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

510 14 Data Ingestion - Point 8 Existing ETL Jobs to be Fine Tuned. Re-runnability

checkpoints should be present in ETL jobs. New ETL

jobs should be able to parallel read and write data.

1. What percentage of ETL jobs need to be

fine-tuned? We need to understand the existing ETL jobs'

performance statistics.

2. Please provide the total number of ETL jobs and the

technology in which they were created.

3. Please provide statistics on number of existing ETL

jobs and number of BI reports in existing system.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

511 14 Data Ingestion - Point 12 An alerting report and monitoring utility about the

ingest pipelines should be available as part of the

solution.

Are we looking at a standalone monitoring utility and

related reports which will be used for alerting /

notification.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

512 14 Data Ingestion - Point 13 Trigger mechanisms in identifying any structural

changes at source

Please elaborate more This is an industry standard terminology

513 15 Data Ingestion - Point 17 Solution should be able to handle DDL change

without manual reorg/runstat. It should handle

network fluctuations and hindrances.

Please elaborate network fluctuations and hindrances

handling

This is an industry standard terminology

514 16 Data Storage - Point 6 User should be able to work on DB even while

backup is in progress. They should be able to run

statistics and reorganize their tables. Any

background process including backup must not

hamper performance of user queries.

Please elaborate more "They should be able to run

statistics and reorganize their tables. Any background

process including backup must not hamper

performance of user queries"

This is an industry standard terminology

515 17 Data Processing Framework - Point

12

ETL/ELT tool for data extraction should have AI/ML

features for suggesting /

improving Query / ETL / ELT Stages

Please elaborate more This is an industry standard terminology

516 18 Data Federation/Virtualization Point

4

Data virtualization should support the use of APIs. Please elaborate more Proposed solution should have capability of APIs to connect to upstream and

downstream applications

517 18 Migration from Existing Setup to

Proposed Solution Point 3

Data migration from existing archival solution to

new one.

Please elaborate more, also need to understand the

existing archival solution

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

518 21 Monitoring Dashboard Point 13 List of daily missing files from source systems Please elaborate more Monitoring dashboard should have capability to automatically flag/highlight

any missed file used for data ingestion from source systems

519 21 Monitoring Dashboard Point 16 Back Dated Data changes needs to be updated on

portal

Please elaborate more Monitoring dashboard should have capability to showcase back-dated data

changes

520 21 Data Governance Point 5 Capability to classify and store (personal

identifiable information) sensitive data in

encrypted /masked form and should have capability

to decrypt/unmask such information in NEXT-GEN

DW when required by only authorized ID’s.

1. Does the Bank have an Encryption/Decryption Utility,

Algorithm or Software which can be reused in the

implementation?

2. Or does the vendor need to develop or use an off-the-shelf tool?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

521 22 Data Governance Point 11 Data modeling capabilities to be provided by the

tool

Please elaborate more This is an industry standard capability

522 39 Annexure B Technical Criteria/Scope

of Work

Critical Functional Requirements: Data Quality Currently, how do you check for data quality issues ?

Describe the current process, tools used if any, and

challenges with the current approach.

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

523 39 Annexure B Technical Criteria/Scope

of Work

Critical Functional Requirements: Data Quality Please describe the type of data quality issues that

you expect to identify in your production data e.g.

presence of NULLs, invalid format of certain fields,

presence of certain unexpected characters/numbers in

the value of field, etc.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

524 39 Annexure B Technical Criteria/Scope

of Work

Critical Functional Requirements: Data Quality How are data quality related requirements being

handled currently ? (e.g. custom script, non-disclosure

agreement, in-house solution etc.). What

tools/processes are currently deployed and what are

their challenges / limitations ?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

525 39 Annexure B Technical Criteria/Scope

of Work

Critical Functional Requirements: Data Quality What is the business nature of the data, on which,

data quality processes such as profiling, cleansing,

deduplication, standardization etc are required to be

executed?

E.g. Customer, Product, Vendor, Item, Security, or any

other. Please describe in detail.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

526 39 Annexure B Technical Criteria/Scope

of Work

Critical Functional Requirements: Data Quality Is there any requirement of address standardization ?

If yes, please specify, for which geography(ies) of

address standardization of customer is required ?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

527 39 Annexure B Technical Criteria/Scope

of Work

Critical Functional Requirements: Data Quality How is Master Data Management being carried out as

of now? Are there any tools being used? If so, are there

any requirements which are not being addressed by

that tool?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

528 39 Annexure B Technical Criteria/Scope

of Work

Critical Functional Requirements: Data Quality What is the nature of input(s) - Please provide

complete details:

a. Flat/XML files - Please mention its format and

structure

b. Direct connection to source database - please

specify database technology (e.g. Oracle, MS SQL

Server, DB2, etc.)

c. Data extracted from database into flat files - Please

mention structure

d. Unstructured data - Please provide details

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

529 39 Annexure B Technical Criteria/Scope

of Work

Critical Functional Requirements: Data Quality Is there any data quality requirement for special

technologies:

a. Data quality in Mainframe environment

b. Data quality in cloud

c. Data quality in Big Data landscape

d. Data quality in packaged applications such as

SAP/ERP/CRM.

If yes, please provide details, so that we can check the

fitment.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

530 39 Annexure B Technical Criteria/Scope

of Work

Critical Functional Requirements: Data Quality How is the output from the Data Quality Management

service is going to be used? What is the target system?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

531 39 Annexure B Technical Criteria/Scope

of Work

Critical Functional Requirements: Data Quality What is the volume of Data to be processed for data

profiling and data quality management?

Refer Annexure G for sizing.

532 39 Annexure B Technical Criteria/Scope

of Work

Critical Functional Requirements : Data Processing

Framework

Is production data copied 'as is' to test regions, for

testing activity ? Do developers/testers currently have

access to actual production data, in the test region?

No. Please refer to Hardware Specification subsection point number #31 on

page #35

533 39 Annexure B Technical Criteria/Scope

of Work

Critical Functional Requirements : Data Processing

Framework

If the data is directly copied from production, how is the

sensitivity of information being taken care of?

Refer to Sr. No. #532

534 39 Annexure B Technical Criteria/Scope

of Work

Critical Functional Requirements : Data Processing

Framework

Does SBI already have some incumbent home-grown or

third party product for Data Masking Requirement?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

535 39 Annexure B Technical Criteria/Scope

of Work

Critical Functional Requirements: Data Governance:

Data at rest/in-motion should be encrypted

Is there a need for dynamic data masking for

following usage scenarios:

-- dynamic masking of web application screens

-- dynamic masking of application, Production logs

-- dynamic masking of database query results

-- document redaction (Masking of data in documents)

Please describe the requirement in depth

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

536 39 Annexure B Technical Criteria/Scope

of Work

Critical Functional Requirements: Data Governance:

Data at rest/in-motion should be encrypted

Are there any special dynamic masking requirements

such as dynamic masking for SAP screens, Mainframe

screens, Thick client application, third party packaged

applications, etc. ? Please provide all the details that

will enable us to check the fitment.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

537 39 Annexure B Technical Criteria/Scope

of Work

Critical Functional Requirements: Data Governance:

Data at rest/in-motion should be encrypted

In case of dynamic masking of web application

screens, will we have access to the web application

server, to deploy the dynamic masking rules on the

web application server ?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

538 11 Eligibility Criteria - Point 2 Vendor should have existing Next-Gen Data

Warehouse solution as mentioned in the EOI

EOI is SBI specific requirement ask. What does it

mean to have a Next-Gen DW solution. All

components proposed are implemented by bidder as

part of different engagements with multiple clients. Is

this in line with the Bank's expectations?

Please refer to Corrigendum.

539 13 Technical Criteria/Scope of Work Cost model (How the licensing will be done)- Actual

commercials are not required at this stage

Since the bank is looking at an on-prem solution and the

components, hardware, software solution need to be SBI

specific, what does the bank mean when asking for

licensing cost model? Is it related to the components

annual AMC cost?

Licensing model means Capex, Opex, PVU based, User based licensing, etc.

540 32 Hardware Specifications Detailing will need time. We request for extension of 2

weeks for submission of response to EOI.

No change in timelines of EOI

541 12 Annexure B - Technical Criteria/Scope

of Work

The Following Specifications for each of PROD, DEV,

UAT and DR environments;

Please share the ratio sizing for Dev environment Please refer Annexure-G. Dev to be 10% of sourced data size.

542 Will SBI's existing EDW and new datalake platform run

in parallel or will there be a sunset for the existing EDW

platform?

This information is not required at this stage of EOI. Bank will take a final call

at appropriate time.

543 What is the current frequency of EDW refresh from

various sources?

This information is not required at this stage of EOI.

544 What is SBI's current technology stack? Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

545 Is SBI ready for Bidder IP based solution? Bidder may propose any suitable solution. Bank will take a final decision in

the best interest of the Bank

546 1. Apart from the existing Data Warehouse and its

current sources, do any other internal sources of data

exist - Excel sheets/ MS Access databases, application

backend databases, other digitized systems?

2. Would this data be also required to be brought into

the Data Lake?

3. What is the volume of such additional data stores?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

547 Physically where are the current data centers located? This information is not required at this stage of EOI.

548 What tools are currently being used for data masking,

metadata management, web scraping?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

549 Is there any migration expected from existing BI tool

to some other BI tool?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

550 What tools are currently being used for visualization,

dashboards, canned reports, adhoc queries?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

551 Detailing of all components post inputs on queries and

additional information regarding current setup, will

need time. We request for extension of 2 weeks for

submission of response to EOI.

No change in timelines of EOI

552 There is mention of Pilot, demo and POC

interchangeably. What is the bank expectation?

Words pilot and demo are not used in EOI.

553 In case we need to do a Pilot or POC, it will take effort,

which can be reviewed separately and agreed upon

post mutual discussions.

No change in standard clause of EOI

554 8 Terms and Conditions i. Lodgement of an EOI is evidence of an applicant’s

consent to comply with the terms and condition of

Request for EOI process and subsequent bidding

process. If an applicant fails to comply with any of

the terms, its EOI may be summarily rejected.

i. Lodgement of an EOI is evidence of an applicant’s

consent to comply with the terms and condition of

Request for EOI process and subsequent bidding

process. If an applicant fails to comply with any of the

terms, its EOI may be summarily rejected. 

No change in standard clause of EOI

555 8 Terms and Conditions ii. Willful misrepresentation of any fact in the EOI

will lead to the disqualification of the applicant

without prejudice to other actions that the Bank

may take. The EOI and the accompanying

documents will become property of SBI. The

applicants shall be deemed to license, and grant all

rights to SBI, to reproduce the whole or any portion

of their product/solution for evaluation, to disclose

the contents of submission to other applicants and

to disclose and/ or use the contents of submission

as the basis for EOI process.

ii. Willful misrepresentation of any fact in the EOI will

lead to the disqualification of the applicant without

prejudice to other actions that the Bank may take. The

EOI and the accompanying documents will become

property of SBI. The applicants shall be deemed to

license, and grant all rights to SBI, to reproduce the

whole or any portion of their product/solution for

evaluation, to disclose the contents of submission to

other applicants and to disclose and/ or use the

contents of submission as the basis for EOI process.

No change in standard clause of EOI

556 32-33 Hardware Specifications 8. The proposed hardware must not fall into ‘End of

Support’ for at least 7 years from the date of

delivery to the Bank.

12. The Vendor is required to supply, install, test,

commission, monitor, manage and maintain the IT

System along with operating system and other

peripherals with one-year warranty and AMC for 4

years from the date of delivery at data centers

advised by the Bank

18. The proposed hardware is mission critical for

the proposed project and support of 24 X 7 with an

uptime of 99.99 % to be ensured by providing

support at PR, and DR site for a period of 5 years.

22. The Hardware solution must be compatible to

integrate with various systems in the Bank

including but not limited to SOC, PIMS, NOC,

Command Centre, ITAM, Service Desk, ADS, and

SSO etc. at no extra cost. Vendor will have to give

appropriate support to the Bank during integration

with various components of IT environment.

8. The proposed hardware must not fall into ‘End of

Support’ for at least 7 years from the date of delivery

to the Bank

12. The Vendor is required to supply, install, test,

commission, monitor, manage and maintain the IT

System along with operating system and other

peripherals with one-year warranty and AMC for 4

years from the date of delivery at data centers advised

by the Bank

The scope of the warranty shall be limited only to

correction of any bugs that were left undetected

during acceptance testing by the Bank. Warranty shall

not cover any enhancements or changes in the

application software, carried out after acceptance

testing. This warranty is only valid for defects against

approved Specifications. The above mentioned

warranty shall also not apply if there is any (i)

combination, operation, or use of some or all of the

deliverables or any modification thereof furnished

hereunder with information, software, specifications,

instructions, data, or materials not approved by

Vendor and operation of the deliverables on

incompatible hardware not recommended by Vendor;

(ii) any change, not made by Vendor, to some or all of

the deliverables; or (iii) if the deliverables have been

tampered with, altered or modified by the Bank

No change in standard clause of EOI

557 8 TnCs (i) Lodgement of an EOI is evidence of an applicant’s

consent to comply with the terms and condition of

Request for EOI process and subsequent bidding

process

Lodgement of an EOI is evidence of EIT’s consent to

comply with the terms and condition of Request for

EOI process and subsequent bidding process shall be

based on an issue of RFP by Bank & EIT submitting the

Bid response with assumptions if any

No change in standard clause of EOI

558 11 Annex A (2) Vendor should have existing Next-Gen Data

Warehouse solution as mentioned in the EOI

Please validate that vendor is expected to be an SI

who would propose fitting solutions from OEMs

Please refer to Corrigendum.

559 12 Annex B (End State objectives) Scope bullets Please clarify if any of these functionalities can be

delivered using existing solutions in the bank. If yes,

please provide details of the existing licenses

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

560 12 Annex B (End State objectives) Vendor may propose one or multiple solutions to

meet the scope of work of this EOI

Please validate that vendors can propose multiple

solutions for each objective mentioned or only one

solution can be proposed for each objective and in the

process no. of OEMs can be more than one. From

subsequent lines, it seems multiple solution options

can be proposed on each objective

Please refer to Corrigendum.

561 14 Functional requirements (7) Data sanity checks, automated reject processing,

validations and reconciliation of data should be

available as part of data ingestion solution to

ensure the integrity of data

Please validate that only validation and reconciliation

reports and limited transformation changes at ETL

level would be expected. Changes required at core

systems end for reconciliation of business figures

would be done by those systems basis validation

reports generated during ETL to new DW. Second last

bullet under 'End State Objectives' on pg 12 indicates a

reconciliation solution is expected (perhaps bank had

purchased GL recon solution as part of OFSAA)

The reconciliation proposed is in respect of reconciliation of data between

Next-Gen DW and upstream/downstream applications as well as within Next-

Gen DW ecosystem

562 15 Data Storage (2) Vendor should propose the type of storage to opt

for in NEXT-GEN DW (SQL, NO-SQL, etc) and

provide details of the hardware requirements,

supported open source / proprietary components

Please clarify if open source solutions can be proposed

as part of the total solution, or whether we only need to give details

of open source solutions supported by proposed

proprietary solutions

Please refer to EOI clauses for more details

563 19 Migration from Existing Setup to

Proposed Solution (12)

Vendor should provide a feasible plan for best use

of existing infrastructure which is procured during

last 10 years

To assess re-use of existing infrastructure, complete

details of existing HW and SW would be needed in

Annex D (e.g. type, quantity, version etc.)

Currently we are using IBM Stack. Bidders are expected to propose high level

migration plan.

564 20 Data Archival and Backup Data Archival and Backup Please clarify that new archival and back up solutions

are expected

Yes

565 28 Downstream Data Consumption Facility to generate and distribute canned /

automatic bursted reports from NEXT-GEN DW to

downstream end users like BID, Analytics, CRM,

YONO, OFSAA, etc

Please clarify that next gen DW will provide data to

downstreams like OFSAA. Reports would be generated

out of downstream applications

This information is not relevant to EOI.

566 General Please clarify if Next gen DW will be set up only for SBI

India operations or it will include foreign operations

and subsidiaries also

Current scope of work covers both Domestic and International operations of

the State Bank of India (SBI) Group including subsidiaries also.

567 General Please clarify whether the existing DW SI / other SIs in SBI will

have any advantage of existing licenses

This information is not required at this stage of EOI. Bank will take a final call

at appropriate time.

568 11 Annexure A - Eligibility Criteria

S.No. 3

The solution should have been

implemented in at least 2 large

scale organizations.

There would possibly be multiple solution components

to meet this EoI requirement. Please clarify if all the

solution components need to have 2 references or

only the main solution components are expected to

have 2 references

Please refer to Corrigendum.

569 13

17

Critical Functional Requirements-

Data Ingestion

Performance benchmark for Next Gen DWH

Capable of ingesting data from any source system

in automated manner currently implemented in the

Bank, or any future standard source systems that

the Bank will decide to use with high throughput

and low latency. Vendor to propose

performance benchmarking for the same.

Performance benchmark of all components of Next-

Gen DW to be given by participating Vendors

Please clarify on the details of the benchmarking so

that this is standard across bidders

Bidder to provide details of performance benchmarking to enable us to take a

holistic and comprehensive view of the architecture in formulating next

course of action

570 25 7 Electronic Submission – Should support for all

regulators globally in all required

formats, including XBRL, XML or other file-based

electronic submission.

Is there an existing SBI XBRL reporting solution which

can be leveraged here?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

571 24 Security and Compliance Authentication and Identity management Can we leverage the existing SBI SOC/security

investment?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

572 26 Data encryption Can we leverage the existing SBI SOC/security

investment?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

573 27 User Management Can we leverage the existing SBI SOC/security

investment?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

574 18 19 The workflows should work with standard

schedulers. Monitoring and management of

workflows should be possible from an easy to use

interface.

Workflow management tool(s) should have

connectors / pluggable interfaces to already

existing / in-use proprietary software available with

the Bank. These could be (and not restricted to)

data repositories, reporting tools, data analysis

tools and generic interfaces for data transfer.

Scheduled jobs status should be made available to

the Bank in Monitoring dashboard on real time

basis.

Please share some detail on the existing schedulers

used at SBI DW

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

575 13 19 Tentative Project Timeline Can the bank please specify total implementation

duration ( in months) planned for implementing the

next gen DW and Business priorities which can be

taken up in different phases including retirement of

existing infrastructure

Participating Bidders are expected to propose high level timelines for this

project as clearly mentioned on page number # 13 of this EOI.

576 18 Data Federation/Virtualization Vendor to propose a solution / tool (s) for Data

Federation/Virtualization to ensure seamless

integration of data in real time when stored in

multiple sources without

physical movement of data sets for the purpose of

reporting / analytics

Is there a federation expected between the existing

and next gen DW? What is the likely duration ( no. of

months) for which the federation is expected

Yes, as per the migration timeline proposed by the Bidder.

Virtualization/Federation is also expected in between different components

of Next-Gen DW along with source systems during the entire period of the

project

577 25 Regulatory reporting Vendor should follow the RBI guideline in

developing the solution with which it will be easier

for the Bank to migrate to the element-based data

reporting envisaged by the RBI.

Please clarify if bank would continue to use its ADF/

regulatory reporting solution to provide element

based data to RBI or does bank visualise doing

element based reporting from next gen DW

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

578 28 Annexure B - Technical Criteria/Scope

of Work. Critical Functional

Requirements, "Data Science

Platform with AI/ML Capabilities"

Implementing end to end analytics use-cases as

mandated by the Bank

What is the scope of the term "end-to-end analytics"?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

579 28 Annexure B - Technical Criteria/Scope

of Work. Critical Functional

Requirements, "Data Science

Platform with AI/ML Capabilities"

Power data / objects to existing analytics models

built on proprietary tools (IBM SPSS). Migration of

such models to new solution

What are existing analytical models/ algorithms in IBM

SPSS that need to be migrated?

This information is not required at this stage of EOI.

580 28 Annexure B - Technical Criteria/Scope

of Work. Critical Functional

Requirements, "Data Science

Platform with AI/ML Capabilities"

Availability of pre-built models which can be directly

used with Bank’s data to get

insights

What are the use cases where the Bank is looking to

use such pre-built models?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

581 29 Annexure B - Technical Criteria/Scope

of Work. Critical Functional

Requirements, "Data Science

Platform with AI/ML Capabilities"

Analytics on real-time data in real-time/near real-

time

What are the real-time or near-real-time uses that are

being envisaged?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

582 29 Annexure B - Technical Criteria/Scope

of Work. Critical Functional

Requirements, "Data Science

Platform with AI/ML Capabilities"

Vendor to provide solution / tool (s) for below

scope of activities on SBI data sets -

· Social Media Analytics

· Web Analytics

Web Analytics if understood broadly is analysing how

visitors on a site behave, might mean implementing a

Web analytics tool (Adobe Analytics, for example) to

monitor and evaluate traffic to a site. This would

require several tasks of tagging pages, classifying

them, besides web page optimization using A/B

testing etc. This apart, it also includes applying

Advanced Analytics and Machine Learning on batch log

data. The latter will require the use of a Data Science/

Machine Learning platform. In what sense is the Web

Analytics term being used here? A similar clarification

is required for use of Social Media analytics too.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

583 29 Annexure B - Technical Criteria/Scope

of Work. Critical Functional

Requirements, "Data Science

Platform with AI/ML Capabilities"

Annexure E gives sample use cases which are to be

implemented on Next Gen Data Warehouse using

structured and/or unstructured and/or semi-

structured and/or any other kind of data gathered

from either Data Warehouse or Data Lake or Data

Virtualization or all together or any other source.

Is Annexure E just a sample list of use cases that need

to be built or does it represent an exhaustive set of all

areas where models need to be developed?

These are sample use cases built for execution on Next Gen DWH platform

over the period of time for Analytical studies. Bank at its own discretion will

implement new models/use cases on this setup in future.

584 40 Annexure E - Functional Use Cases In general How many of the use cases in Annexure E are already

implemented in the existing DWH and only need to be

migrated and how many of them need to be

implemented from scratch in the New Solution?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

585 40 Annexure E - Functional Use Cases In general Is the expectation to deliver these 32 use cases for the

entire Bank, across its entire customer base and across

all its Businesses? If Model training for these use cases

needs to happen by each customer segment, or

Business (or other groupings) separately, how many

models in total would it translate into?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

586 40 Annexure E - Functional Use Cases In general What are the expected deliverables for the 32 use

cases?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

587 40 Annexure E - Functional Use Cases In general What are the expected timelines / priority of delivery

for all the 32 use cases? This is needed to help us suggest the total

implementation timeline on Next-gen DW

Participating Bidders are expected to propose timelines for this project as

clearly mentioned on page number # 13 of this EOI.

588 13 2 Data Ingestion

Data may be structured, semi-structured, and

unstructured. It may come from internal or external

sources. It may come in batches, incremental

additions or real-time feeds. There should be no

limitation on the type, format and size of data

ingested. Data may include log, feeds, audio, video,

image, NOSQL, RDBMS, unstructured text, through

ERP systems, etc

Is the NOSQL data mentioned here JSON/XML ? What

kind of processing on NOSQL is expected ? Is there a

need to join the NOSQL data with other relational data

? Is there a need to shred the NOSQL data into

relational data ?

Duplicate Query. Refer to Sr.No. #189

589 16 11 Data Storage

The storage system should be robust to handle at

least 1,50,000 concurrent queries (Select/DML) by

processing engines / ETL jobs / end users scalable

up to 6,00,000 concurrent queries in next 5 years

(assuming parallelism of 100 degree).

Will all these queries be executing in-flight at the

same time ? Or will these be initiated over a period of

time and the aggregate number of queries run during

that time period will be 600,000 ?

Duplicate Query. Refer to Sr.No. #192
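
The scaling targets quoted in this clause can be turned into an implied yearly growth rate. The sketch below is illustrative only: it assumes steady compound growth over the five years, which the EOI does not state, and `implied_annual_growth` is a hypothetical helper name.

```python
def implied_annual_growth(current: float, target: float, years: int) -> float:
    """Return the compound annual growth rate needed to go from current to target."""
    if current <= 0 or target <= 0 or years <= 0:
        raise ValueError("current, target and years must all be positive")
    return (target / current) ** (1.0 / years) - 1.0


# Figures quoted in the clause: at least 1,50,000 concurrent queries,
# scalable up to 6,00,000 in the next 5 years
# (Indian digit grouping, i.e. 150,000 and 600,000).
rate = implied_annual_growth(150_000, 600_000, 5)
print(f"Implied growth: {rate:.1%} per year")
```

Under that assumption the requirement works out to roughly 32% growth per year. The same helper applies to the other "scalable up to" figures in the EOI, such as parallel jobs or concurrent users.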

590 15 1 Data Storage

Vendor should propose effective number of data

storage layers in NEXT-GEN DW between data

ingestion and data consumption.

In the existing DW solution, are surrogate keys used?

If yes, is there a framework for storage and

management of the keys to ensure robustness of the

data warehouse?

Duplicate Query. Refer to Sr.No. #191

591 16 13 It should be possible to project and view data

through multiple modes using the Storage on NEXT-

GEN DW. Varieties of GUIs should be available to

project or view the output generated through

analytic processes. For instance: The Bank may

decide to implement use-cases that project

transactions data as a graph data structure. The

Storage solution on NEXT-GEN DW should allow for

such projections.

Do we need to enable graph based analytics as well or

it should be only limited to the facility to access the

data for graph based analytics.

Duplicate Query. Refer to Sr.No. #193

592 17 2 Data Processing Framework

The NEXT-GEN DW ecosystem should have state of

the art data processing engines that can perform in-

memory processing to reduce the time for data

transformations and query in case of real time

requirements.

What are the expected latency requirements for real

time processing? The solution may vary by use case, depending on whether

the requirement is for immediate processing or a latency

of up to 5-10 minutes.

Duplicate Query. Refer to Sr.No. #195

593 17 9 Automatic recovery of data after failure/rejection

of record needs to happen without any manual

intervention

Is there any specific treatment that needs to be

performed for rejected records?

Duplicate Query. Refer to Sr.No. #199

594 17 14 Data Processing Framework

Data transformations should be triggered in

parallel. The NEXT-GEN DW should be capable to

run multiple transformation jobs in parallel. The

NEXT-GEN DW should be able to run at-least 1500

jobs in parallel, scalable up to 5000 in next 5 years,

of varying complexity - simple, medium, complex, in

batch or near real time mode every day.

Are the bulk of the data transformation jobs expected to

be triggered during non-business hours, when user

reporting or other workloads are at a minimum? Are

there going to be users across multiple time zones, or

will a large part of the user base be within a single time

zone?

Duplicate Query. Refer to Sr.No. #201

595 18 2 Migration from Existing Setup to Proposed Solution

Data migration from Staging and Data Marts, user

tables and any other schemas identified by Bank.

Along with the Staging and Data Mart objects, is there

any integrated data layer in the existing solution? If

yes, then is it built using any proprietary data model?

Duplicate Query. Refer to Sr.No. #206

596 29 7 Data Science Platform with AI/ML Capabilities

In-memory computing & integration with Spark,

Redis, etc

What kind of analytic processing is expected on Spark,

Redis, etc. ? Is this using Spark-ML, for example ?

Duplicate Query. Refer to Sr.No. #207

597 35 33 Hardware Specifications

Next-Gen DW should support at-least 500

concurrent users, scalable up to 1000 users in next

5 years, running ETL/ELT jobs or doing ad-hoc data

extraction requests on database (Not including API

based access or scheduled job connections to

database)

What is the nature and expected concurrency of API

based access ? What is the nature and concurrency of

scheduled job connections - are these ETL, or

maintenance related connections ?

Will the 1000 concurrent users be expected to be

running queries simultaneously, or is this just 1000

concurrent logons ?

Duplicate Query. Refer to Sr.No. #211

598 36 37 Hardware Specifications

Ad-hoc jobs of any complexity should not hamper

the scheduled jobs performance.

What is the expected mix of queries in terms of tactical

(very short), medium, long running (reports), batch

loads, near real-time and real-time loads running

simultaneously on the system ?

Duplicate Query. Refer to Sr.No. #212

599 19

Migration from Existing Setup to

Proposed Solution

Vendor to review the existing architecture during

migration and remove duplication of data and

recommend improvements in overall setup if any

What deduplication rules have been defined by bank

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

600 13 Scope of Work Team structure (without actual profiles) Does the bank envisage OEM involvement in

implementing core OEM related services via SI

Yes, please refer to point number # 16, under Hardware Specification on page

number # 34 in EOI for more details.

601 13 Critical Functional Requirements -

Data Ingestion

Data may be structured, semi-structured, and

unstructured. It may come from internal or external

sources. It may come in batches, incremental

additions or real-time feeds. There should be no

limitation on the type, format and size of data

ingested. Data may include log, feeds, audio, video,

image, NOSQL, RDBMS, unstructured text, through

ERP systems, etc

Please provide ratio of split of Structured: Semi-

Structured; Unstructured (Images): unstructured

(Videos) data. This helps in solutioning

Duplicate Query. Refer to Sr.No. #224

602 25, 39 Critical Functional Requirements -

Regulatory Reporting

Automation –Tool should automate analytics and

reporting workflow end-to-end, including all data

collection, enrichment, and management, as well as

all calculations, processes to final report

submission. Currently 500+ jobs are being used for

Tranche 1 DCT generation along with 500 more for

other regulatory reports/returns.

Please provide the number of returns/reports for

regulatory body over and above the ones listed in

Annexure E

Duplicate Query. Refer to Sr.No. #227

603 11 2 Eligibility Criteria

Vendor should have existing Next-Gen Data

Warehouse solution as mentioned in the EOI

Does the term 'Next-GEN DW' solution mean a solution

comprising Data Lake & Data Warehouse? Also, does

the 'Next-GEN DW' solution refer to procurement of

hardware & software application ?

Will the term 'Next-GEN DW' solution be part of the

overall contract to be signed ?

Word Next Gen DW refers to solution(s) fulfilling all the requirements given

by the Bank in this EOI. Purpose of EOI is clearly mentioned as -> Please note,

the objective of this Request for EOI is to identify all possible solution (s) for

the scope of work defined in this document.

604 14 3 Data Ingestion:

GUI based framework to configure sources to NEXT-

GEN DW

Please elaborate the expectation. Does the

configuration require addition/modification

1. Add/Delete Source System

2. Add/Delete Source System Tables, Manual Files

3. Add/Delete Metadata Reconciliation information,

etc.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

605 14 8 Data Ingestion:

Existing ETL Jobs to be Fine Tuned. Re-runnability

checkpoints should be present in ETL jobs. New ETL

jobs should be able to parallel read and write data.

Please confirm the timeframe for the parallel run. We trust

that the parallel write of existing and new ETL will be on

different servers

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

606 18 18 Data Processing Framework

Transformations for this activity can be categorized

into the following types:

·     Existing transformations in DWH that needs to

be migrated to NEXT-GEN DW

·      New transformations for data sources that are

not sourced by DWH

·      Transformations and data processing pipelines

for real time data capture

Are there any real-time transformations happening in

the current setup ? If yes, please share the no. of real-time

transformations happening and also give an example

of the same.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

607 18 1 Migration from Existing Setup to Proposed

Solution:

Vendor should propose a detailed seamless

automated migration plan from existing setup to

proposed solution. Plan should focus on less

manual intervention, data reconciliation between

the systems and minimum parallel run of existing

and proposed solution.

What is the current technology set-up ? How is the

reconciliation done and what is the degree of

correctness achieved in the current system?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

608 18 3 Migration from Existing Setup to Proposed

Solution:

Data migration from existing archival solution to

new one.

Currently, the archival solution holds data for how many

years ? How many years of data is the Bank looking to

archive in the new DWH ?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

609 20 5 Data Archival and Backup :

Store backup of entire ecosystem on suitable cost-

effective, fast recovery infrastructure (Currently

tape backup is taken)

Tape backups are taken at what frequency. Will the

same frequency persist for the New DWH ?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

610 20 1 Cloud Integration and Migration :

NEXT-GEN DW should be able to consume data

from external cloud-based infrastructures.

Are there any restrictions on fetching data from

external clouds ? If yes, How will data consumption

happen - DB connectivity/Manual Files? Where is data

center located for external clouds ?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the RBI

guidelines on the same

611 20 5 Cloud Integration and Migration :

In view of the intent to reduce the hardware

footprint (in future), the technical architecture of

NEXT-GEN DW solution should be flexible to

accommodate provisioning of NEXT-GEN DW on

cloud. The Bank understands that there can be

differences in services offered by cloud service

providers. The NEXT-GEN DW solution architecture

should be designed considering as-is infrastructure

availability in cloud.

Share details of the current cloud set-up/architecture

This information is not required at this stage of EOI.

612 25 3 Regulatory Reporting:

Change Management –Tool for handling ongoing

change in regulation or business requirements

without the need for programming expertise. On

an average, logic for 5% of jobs is changed

monthly. Data used for regulatory reporting

changes on any frequency like daily / weekly / bi-

weekly / monthly, etc

Which is the current tool used for Change

Management ?

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

613 26 11 Regulatory Reporting:

Pre-submission Review – Multiple report writers

should allow users to review reports in various

formats before submission, with the ability to drill

down and make manual adjustments where

necessary.

How is the current manual adjustment done - at DB

Level or in Reports ?

This information is not required at this stage of EOI.

614 27 1 User Management:

Vendor should propose automated solution / tool

(s) of User Access Management (UAM) for

administration of giving individual users

within a system access to the tools they need at

the right time.

How are users currently accessing the DWH, using only

reports or have read access on the DB or are they

connecting to the DWH using some third party tool.

Please list down all the tools

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

615 28 2 Data Science Platform with AI/ML Capabilities:

Power data / objects to existing analytics models

built on proprietary tools (IBM SPSS). Migration of

such models to new solution

Total No. of Models to be Migrated ?

This information is not required at this stage of EOI.

616 12 and 13 End state objectives End state objectives Kindly confirm if there will be separate RFP or tender

for hardware and software (Data Warehouse, Data

Marts, Data Lake, etc) or if there will be single RFP and

bidder would be required to propose the hardware as

per their respective sizings. We are referring to the Big

Data Lake RFP released last year (March / April 2018)

where separate RFP for hardware and software was

floated

This information is not required at this stage of EOI.

617 Page 32 and

Page 34

Hardware Specifications Hardware Specifications The EOI envisages use of commodity hardware (page

32) but also mentions uptime of 99.99 % (page 34).

Please note, the overall availability would be built into

the architecture design and the SLAs confirmed basis

the overall design. Kindly confirm if this is correct

understanding

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.
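
For context on the 99.99 % uptime figure raised in this query, the availability percentage can be converted into a downtime budget. This is a minimal sketch, assuming a 365-day year and that all downtime counts against the figure. The EOI does not define the measurement window, and `downtime_budget_minutes` is a hypothetical name.

```python
MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(uptime_percent: float, days: int = 365) -> float:
    """Maximum downtime, in minutes, permitted over `days` at the given uptime."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    return days * MINUTES_PER_DAY * (100.0 - uptime_percent) / 100.0


# 99.99 % is the uptime quoted in the Hardware Specifications clause.
yearly = downtime_budget_minutes(99.99)
print(f"99.99% uptime allows about {yearly:.1f} minutes of downtime per year")
```

On these assumptions the budget is a little under an hour per year, which is why the query argues that availability has to come from the architecture design and not from individual commodity servers.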

618 Page 32 Hardware Specifications Not Available (we request for additional

qualification criteria)

Request SBI to restrict underlying hardware providers

to vendors with proven reference architectures for

setting up data warehouse / data mart / data lake kind

of solutions. The reference architectures should be

published on the vendors website. Request SBI to

further consider vendors with reference architectures

using last two generations of Intel CPUs.

No change

619 Page 32 Hardware Specifications Not Available (we request for additional

qualification criteria)

Request SBI to consider only hardware vendors that are

among the top vendors in terms of market share (both

revenue and units). Reports published by

organizations such as Gartner and IDC could be used

to ascertain the top vendors

No change

620 15 3.Data Storage A multi-temperature data management solution to

be proposed by vendor, where frequently accessed data

(hot data) is kept on fast storage,

less frequently accessed data (warm data) is

stored on slightly slower storage, and

rarely accessed data (cold data) is stored on the slowest storage.

System should also be capable of

automated storage tiering and seamless data

transfer between hot, warm and cold storage. Data

residing in any of these storage areas must be

seamlessly mixed / merged according to

requirements without impacting performance.

Request Bank to not make this clause of three tier as

mandatory as OEM may also prefer to give an all flash

solution or a two tier Architecture based on the

solution Performance requirement. Also please

suggest on the following 7 points :-

1. What are the possible applications or data analytics

engine customer is considering for AI/ML data

warehouse project? ( to check if we have ISV

partnership with the AI/ML stacks to be considered)

would this be real time analytics, cold or hot analytics

or a combination of both?

2. What is the approximate data size for migration

from old set up?

3. What are the applications and protocols involved in

data that needs to be migrated?

4. What would be average retention of data on active

tier?

5. What protocols need to be considered for the data

warehouse project?

6. What OS flavor servers would be involved for data

processing?

7. Is customer willing for ready stacks for AI we have in

our offerings? (AI ready stack, end to end ready

solution for AI/ML use cases with networking,

compute, storage and racks in partnership with NVIDIA)

No change in standard clause of EOI

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.
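
The hot / warm / cold split described in this clause can be illustrated with a simple age-based rule. The sketch below is not taken from the EOI: the 30-day and 365-day thresholds, the use of last-access date as the only signal, and the name `storage_tier` are all assumptions made for illustration. A real tiering policy would be proposed by the bidder.

```python
from datetime import date

# Hypothetical thresholds; the EOI does not specify any.
HOT_MAX_AGE_DAYS = 30
WARM_MAX_AGE_DAYS = 365


def storage_tier(last_accessed: date, today: date) -> str:
    """Pick a storage tier from the number of days since the data was last accessed."""
    age_days = (today - last_accessed).days
    if age_days < 0:
        raise ValueError("last_accessed cannot be later than today")
    if age_days <= HOT_MAX_AGE_DAYS:
        return "hot"
    if age_days <= WARM_MAX_AGE_DAYS:
        return "warm"
    return "cold"


today = date(2020, 6, 9)
for last in (date(2020, 6, 1), date(2020, 1, 15), date(2018, 3, 31)):
    print(last.isoformat(), "->", storage_tier(last, today))
```

Automated tiering, as the clause asks for, would amount to re-evaluating such a rule on a schedule and moving data whose tier has changed.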

621 20 1.Data Archival and Backup Data older than specific duration as identified by

Bank to be archived in low cost cold storage.

Changing data archival rules should be easily

configurable. Vendor to propose solution for the

same with cheap and flexible storage and

processing

Recommend Bank to consider On-Premise Cloud based

Archive solution instead of Tape Based traditional

method to bring down TCO and consider modern

technology concepts

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

622 39 Annexure D General Please provide the existing EDW technical

architecture along with the tools used (ETL, DQ, DW,

BI and their versions) which will help us to integrate it

with proposed Data Lake, also provide the list of

downstream systems being used in the existing DWH.

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.

623 12 Annexure B Migration from existing setup to proposed solution Does the bank intend to use the existing DW for

some time after the migration activities, or will it be a

complete sunset after migration?

Bank will take a final call at appropriate time.

624 12 Annexure B Framework for regulatory reporting Does this include implementation of ADF (Automated

Data Flow).

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

625 13 Annexure B Does the bank have any tentative project timeline in

mind.

Participating Bidders are expected to propose high level timelines for this

project as clearly mentioned on page number # 13 of this EOI.

626 13 Annexure B Is the bank open to use completely Open Source

technologies and commodity hardware.

Please refer to EOI clauses for more details

627 13 Annexure B Performance Benchmarks Will there be any 3rd party involved to conduct the

performance benchmarks. Does the bank have any

performance parameters listed.

Bank may take a call to involve a 3rd party to conduct/test performance

benchmark in future.

Bidder to provide details of performance benchmarking to enable us to take a

holistic and comprehensive view of the architecture in formulating next

course of action

628 39 Annexure D Apart from Reports/ETL , Are there any existing

statistical models to be migrated into the new setup.

Please refer to EOI

629 20 Annexure B Cloud Integration Does bank want entire Next Gen setup in

public/private cloud (if required) or just cloud

integration (e.g. To process external data and bring

only processed data to its Nextgen DW for analysis)

At present, as per the Bank's IS policy migrating/storing data in public cloud is

not permitted. However bidders may propose, as an alternative, use of cloud

(public cloud, private cloud, on-premise etc) in addition to the best integrated

proposed solution.

630 20 Annexure B Transfer out of NEXT-GEN DW to public cloud

should not be possible by all roles in NEXT-GEN

DW. All activities with data transfer from Public

cloud should be logged for audit and monitoring.

It is interpreted that data will only flow INTO the Next

gen DW from the public clouds and never move out of

the Next gen DW to any of the public sites. Is this

understanding correct?

At present, as per the Bank's IS policy migrating/storing data in public cloud is

not permitted. However bidders may propose, as an alternative, use of cloud

(public cloud, private cloud, on-premise etc) in addition to the best integrated

proposed solution.

631 20 Annexure B General How many Analytics users / Data Scientists are likely

to access the system? What is the concurrency ?

Please refer Hardware Specification subsection, point number #34 on page

#35

632 39 Annexure D ETL Please share the details of no of ETL jobs to be

migrated into NextGen DW, by complexity. Where :

1.Very Simple = 4 Transformations

2. Simple = 6 Transformations

3. Medium = 10 Transformations

4. Complex = 15 Transformations

5. Very Complex > 15 Transformations

This information is not required at this stage of EOI.
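
The complexity buckets proposed in this query can be expressed as a small classifier, which is one way a bidder might profile an ETL inventory once job details are available. The sketch treats each quoted figure as an inclusive upper bound, which is an interpretation: the query lists single values (4, 6, 10, 15) and "> 15" without stating ranges. The function names and the sample counts are made up for illustration.

```python
from collections import Counter

# (inclusive upper bound on transformations, bucket label).
# Reading the quoted figures as upper bounds is an assumption.
COMPLEXITY_BOUNDS = [
    (4, "Very Simple"),
    (6, "Simple"),
    (10, "Medium"),
    (15, "Complex"),
]


def etl_complexity(transformations: int) -> str:
    """Map a job's transformation count to one of the five buckets."""
    if transformations < 0:
        raise ValueError("transformations cannot be negative")
    for upper_bound, label in COMPLEXITY_BOUNDS:
        if transformations <= upper_bound:
            return label
    return "Very Complex"


def complexity_profile(transformation_counts) -> Counter:
    """Count how many jobs fall into each bucket."""
    return Counter(etl_complexity(n) for n in transformation_counts)


# Made-up transformation counts for eight hypothetical jobs.
sample_jobs = [3, 5, 6, 9, 12, 15, 16, 40]
profile = complexity_profile(sample_jobs)
for label, count in profile.items():
    print(f"{label}: {count}")
```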

633 20 Annexure B General Is there any requirement for Post Go live Support. If

yes , then please specify the duration of the support

period.

Required information is clearly given in EOI on page number # 33, point

number 12.

634 20 Annexure B General For 'in memory processing' - what will be the use cases

,volume and duration of data to be considered ?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

635 13 Annexure B Performance Benchmarks Will the benchmarking be done with data encryption

and masking features.

Yes,

Bidder to provide details of performance benchmarking to enable us to take a

holistic and comprehensive view of the architecture in formulating next

course of action

636 20 Annexure B Data Masking Please explain what level of masking is expected?

Is data expected to be masked in Production and

stored in masked form? OR Is data in non-prod also

expected to be masked.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

637 13

specification

Please clarify what is meant by internal and external

specification.

Internal network specifications means within the different components of

Next-Gen DW and external network specification means network

requirements between Next-Gen DW and other applications in the Bank

638 34 15 The Vendor shall ensure all Installations &

Implementation to be done by OEM badged

resources only

Request the Bank to modify this to OEM/Bidder

resources.

No change in standard clause of EOI

639 15 Architecture diagram Deployment plan - Vendor to

submit architecture diagram of entire setup with

network and security equipment required. Bank

may change it after vetting by Information Security

Dept and / or Enterprise architecture Dept. It will

be binding on vendor.

Will this setup be completely green field? Is the bidder

expected to supply the network and security

equipment?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

640 34 18 The proposed hardware is mission critical for the

proposed project and support of 24 X 7 with an

uptime of 99.99 % to be ensured by providing

support at PR, and DR site for a period of 5 years.

Is bidder expected to provide onsite resources for

every domain? Please clarify.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

641 35 22 The Hardware solution must be compatible to

integrate with various systems in the Bank

including but not limited to SOC, PIMS, NOC,

Command Centre, ITAM, Service Desk, ADS, and

SSO etc. at no extra cost. Vendor will have to give

appropriate support to the Bank during integration

with various components of IT environment.

Please clarify what integration is expected? Proposed solution by Bidder should be able to work with existing tools

mentioned in the clause

642 24 Technical Criteria/Scope of Work Data protection, Data security, Data privacy Whether client is expecting PII and SPI need to be

masked and encrypted in proposed DWH

environment. Whether bidder can leverage client

existing Masking and Encryption mechanism

Yes, Bidders are expected to propose solution(s) for encryption and masking

of PII and SPI data.

643 24 Technical Criteria/Scope of Work Authentication and Identity Management - A

comprehensive identity and access management

system should be available for centralized

management of users and groups. It should be

possible to quickly create and revoke the identity of

a user or a service by simply deleting or disabling

the account in the directory. Multi-factor

authentication is desired as an additional layer of

security for user sign-in and transactions

Whether client is expecting Data Warehouse solution

need to be integrated with IDAM solution, Please

provide the number of users, or whether bidder can leverage

client existing IDAM solution to be integrated with

DWH

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

644 25 Technical Criteria/Scope of Work Compliance to Global Standards - GDPR, BCBS239,

PCIDSS, DFRA and similar relevant standards

So whether client PII and SPI information is going to

be transferred from European territory to a centralized DWH

site in India. As per the GDPR standards

Organization should document personal data they

hold, where it came from and whom do they share it

with.

Yes, proposed solution(s) should be GDPR compliant.

645 25 Technical Criteria/Scope of Work Data Leakage - Security CIA parameters should be

achieved, and tools should be able to find and alert

on Data leakage

Whether client is expecting data leak prevention from

proposed Data Warehouse setup. Whether bidder can

leverage client existing DLP solution for securing the

data in rest and in motion.

Yes, Bidders are expected to propose solution(s) for DLP

646 25 Technical Criteria/Scope of Work Vendor should propose automated solution / tool

(s) of User Access Management (UAM) for

administration of giving access to individual users

within a system access to the tools they need at

the right time

Our understanding is proposed solution must be

integrated with PAM solution. Whether bidder can

leverage client existing PAM solution

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

647 General Please provide the additional security controls that

bidder needs to propose.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

648 33 Hardware Specifications General What are the level of details required in the Hardware

Specifications. Please specify OR share any format that

the bank may have.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

649 19 Disaster Recovery The DR solution should be synced with production

NEXT-GEN DW. The SLA for RTO should be

maximum 2Hrs as per Bank’s defined policy.

Are there any requirements for RPO Bidders to provide their best RPO for solution (s) proposed.

650 33 Hardware Specifications Vendor must ensure that the proposed servers are

fault-resilient with the most comprehensive

features and functionalities to ensure maximum

system uptime.

Can you please specify the details for ACTIVE-PASSIVE

requirements for DC/DR

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

651 36 Hardware Specifications The Vendor also needs to provide the configuration

for setting up of functional DR along with DEV and

UAT for each and every component / application of

Next Gen DW ecosystem.

Can you please specify the exact scope/coverage of

"Functional DR"

Please refer to subsection Disaster Recovery on page no #19 of EOI document

652 51 Hardware Specifications Vendor to submit all back-to-back agreement

copies between Vendor and SI / OEM / Parent

company etc if any and tenure of the back-to-back

agreement should be same as selected Vendor’s

agreement with the Bank

Would request the bank to keep this requirement at

the RFP stage.

Please refer to Corrigendum.

653 11 Annexure A 2. Vendor should have existing Next-Gen Data

Warehouse solution as mentioned in the EOI

3. The solution should have been implemented in

at least 2 large scale organizations.

The Annexure B reference is too open, Pl help with the

refined proposition. Sample below e.g. :

• Implementation of Data warehouse in Government/

Banking/ Telecom including the below criteria:

- Integration of Source Systems

- Integration of Reference Dimension Data

- Reports / Downstream Data reports

- Creation of Daily/Monthly Aggregates

- Historical Data Migration

- Near Real Time Platform including Data Integration &

Hadoop.

- 5+ Petabyte of data warehouse

- Near Real Time Data Availability Platform

- Daily data ingest of 5+ TB

- 100 + Hadoop Nodes for processing the data

- The approximate value of the project should be 20 +

Cr (INR)

Please refer to Corrigendum.

654 12 Annexure B EOI Proposal should include following items for

each proposed option

The Following Specifications for each of PROD, DEV,

UAT and DR environments;

Additionally, the Performance Testing and Pre-Prod

Environment should also be included

No change in the requirements of EOI

655 13 Annexure B - Critical Functional

Requirements

2 - Data may be structured, semi-structured, and

unstructured. It may come from internal or external

sources. It may come in batches, incremental

additions or real-time feeds. There should be no

limitation on the type, format and size of data

ingested. Data may include log, feeds, audio, video,

image, NOSQL, RDBMS, unstructured text, through

ERP systems, etc

unstructured Data Should have use case identified, if

this is for storage only the solution may be different

from the Processing of the unstructured data.

Please refer to Annexure E for sample use cases

656 14 Annexure B - Critical Functional

Requirements

3 - GUI based framework to configure sources to

NEXT-GEN DW

Could you please elaborate on this requirement? Refer to Sr. No. #604

657 14 Annexure B - Critical Functional

Requirements

8 - Existing ETL Jobs to be Fine Tuned. Re-runnability

checkpoints should be present in ETL

jobs. New ETL jobs should be able to parallel read

and write data.

What is the Current ETL Tool with Version for

Finetuning. Approach will be suggested accordingly

Currently we are using IBM Stack. Bidders are free to propose solution(s) in

the best interest of the Bank to meet the requirements given in the EOI.
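As an illustration of what "re-runnability checkpoints" in an ETL job could look like, the sketch below records each completed step in a small JSON file so that a rerun after a failure resumes at the failed step instead of starting over. It is a minimal sketch under stated assumptions: the function name, the step names and the JSON checkpoint file are all hypothetical and are not taken from the EOI or from the Bank's existing stack.

```python
import json
import tempfile
from pathlib import Path


def run_with_checkpoints(steps, checkpoint_path):
    """Run (name, callable) steps in order, skipping steps already recorded as done."""
    path = Path(checkpoint_path)
    done = set(json.loads(path.read_text())) if path.exists() else set()
    executed = []
    for name, func in steps:
        if name in done:
            continue  # completed in an earlier run, so it is not repeated
        func()  # an exception here leaves the checkpoint file untouched
        done.add(name)
        path.write_text(json.dumps(sorted(done)))  # persist after every step
        executed.append(name)
    return executed


# Demonstration with a simulated failure in the (hypothetical) load step.
state = {"fail_load": True}


def extract():
    pass


def transform():
    pass


def load():
    if state["fail_load"]:
        raise RuntimeError("simulated load failure")


steps = [("extract", extract), ("transform", transform), ("load", load)]
ckpt = Path(tempfile.mkdtemp()) / "job.ckpt.json"

try:
    run_with_checkpoints(steps, ckpt)
except RuntimeError:
    pass  # first run stops at the load step

state["fail_load"] = False
rerun = run_with_checkpoints(steps, ckpt)  # only the failed step runs again
print(rerun)
```

A production job would also need idempotent steps and a way to reset the checkpoint for the next business date; both are outside this sketch.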

658 14 Annexure B - Critical Functional

Requirements

Tools used for Data Ingestion should be platform

and database independent and should be

compatible to ingest and replicate data on parallel

processing

Will open source stack be allowed? Please refer to EOI for details

659 14 Annexure B - Critical Functional

Requirements

End objective for the data ingestion is to publish

the dashboards for end users or any job related to

reporting and analytics max by 8.00am on next

business day

What is the time of Close of Business processing?

Objective is to determine the window SI gets for

report availability at 8am.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

660 14 Annexure B - Critical Functional

Requirements

Trigger mechanisms in identifying any structural

changes at source

Will Source systems allow access to such logs? This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

661 14 Annexure B - Critical Functional

Requirements

The vendor should be able to design solutions to

handle data volumes and complexity in source data

with decompression logic wherever required.

Source system should allow de-compression process

to run the same server to ensure service compatibility.

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

662 16 Annexure B - Critical Functional

Requirements

11- The storage system should be robust to handle

at least 1,50,000 concurrent queries (Select/DML)

by processing engines / ETL jobs / end users

scalable up to 6,00,000 concurrent queries in next 5

years (assuming parallelism of 100 degree).

What is the Sizing of the Existing Infrastructure and

Data Storage Platform

Refer Annexure G for sizing. Currently we are using IBM Stack. Bidders are

free to propose solution(s) in the best interest of the Bank to meet the

requirements given in the EOI.
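For sizing discussions, the growth implied by this clause (1,50,000 concurrent queries now, 6,00,000 in five years) can be expressed as an annual rate. The sketch below assumes smooth compound growth, which the EOI does not state; it only gives the start and end figures. The function name is hypothetical. Under that assumption the implied growth is about 32 % per year.

```python
def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


growth = implied_annual_growth(150_000, 600_000, 5)

# Year-by-year projection under the same compound-growth assumption.
projection = [round(150_000 * (1 + growth) ** year) for year in range(6)]
print(f"{growth:.1%}", projection)
```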

663 16 Annexure B - Critical Functional

Requirements

11- The storage system should be robust to handle

at least 1,50,000 concurrent queries (Select/DML)

by processing engines / ETL jobs / end users

scalable up to 6,00,000 concurrent queries in next 5

years (assuming parallelism of 100 degree).

What is the distribution of users for:

1) Within intranet - application/ report users

2) From internet

3) Sandbox

This information is not required at this stage of EOI.

664 16 Annexure B - Critical Functional

Requirements

12 - Downstream departments (data consumer) to

be given separate processing power, storage to

undertake their requirements with separate DB

snapshot, Audit trails should be available for any

user accessing the Databases. Construction of this

separate Database snapshot and enabling this audit

trails must not cause any major systemic

issues/challenges in smooth functioning of primary

DB.

Is SBI looking for a separate DB snapshot with a

separate queue as well for the specific users within

the warehouse?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

665 16 Annexure B - Critical Functional

Requirements

13 - It should be possible to project and view data

through multiple modes using the Storage on NEXT-

GEN DW. Varieties of GUIs should be available to

project or view the output generated through

analytic processes. For instance: The Bank may

decide to implement use-cases that project

transactions data as a graph data structure. The

Storage solution on NEXT-GEN DW should allow for

such projections.

graph data structure can be provided in RDBMS as well

as Other structures like Graph DB which stores the

data in XML/JSON. Does SBI have any specific thoughts

around the same

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

666 17 Annexure B - Critical Functional

Requirements

Framework should have mechanism to protect data

at rest and at motion from unauthorized user

access and amendments.

Will the solution need its own access provisioning

framework or it can integrate with Bank's existing

authentication framework?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

667 17 Annexure B - Critical Functional

Requirements

Data Processing Framework This has Overlaps with the Data Ingestion, Any specific

Need to have this separately?

Data ingestion belongs to data sourcing requirements and Data Processing

framework covers scope of internal data processing in new set up and data

extraction for downstream departments & users.

668 17 Annexure B - Critical Functional

Requirements

Migration from Existing Setup to Proposed Solution 1)Please list down the sources of structured/

unstructured/SemiStructured data and its volumetric

and growth. Also if external sources please list them

2)Are any of the data sources anticipated in near

future? For current implementation do we need to

estimate any future sources? Please confirm

3)Are there any requirement for data marts to be

created? If so please provide the functional area with

number of data marts required.

4)For the ODS/data warehouse process, Please

provide the following source details w.r.t. structured

data

a. List/Number/Types of databases,

b. List/Number of physical tables

c. List of source Files for integration,

d. One time/daily load Volumetric and data growth

e. Batch processing latency etc.

5)SI assume that all the source system applications

will be able to provide the historical data from sources

in readable format

to be loaded into the DWH as files, please confirm (for

structured data)

6)If previous assumption is incorrect then:

1. Sufficient information required at this stage for sources, sizing is given in

EOI

2. Please refer to Annexure C for Sizing of Data for more details

3. This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

4. This information is not required at this stage of EOI.

5. Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

6. Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

7. Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

8. Please refer to sub section Data Ingestion on page number # 13 for details.

9. This information is not required at this stage of EOI.

10. Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI. Bank will take a final call at

appropriate time.

11. Refer to Annexure C. Bidders are free to propose solution(s) in the best

interest of the Bank to meet the requirements given in the EOI.

12. This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

13. Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

14)Refer to Annexure C, D in EOI

15)This information is not required at this stage of EOI.

16)This information is not required at this stage of EOI. Bidders are free to

669 18 Annexure B - Critical Functional

Requirements

Data Federation/Virtualization -3

Semantic integration of structured & unstructured

Data.

Kindly help us with the Use case for the unstructured

Data within Federation layer, this will help us in

putting the pointed solution

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

670 18 Annexure B - Critical Functional

Requirements

All data should be made discoverable and

integrable easily through a single virtual layer

which will expose redundancy and quality issues

faster.

Are we referring to only reportable data or data from

all layers of warehouse?

Data from all the layers of Next-Gen DW and all source systems

671 18 Annexure B - Critical Functional

Requirements

Migration from Existing Setup to Proposed

Solution, 2 - Data migration from Staging and Data

Marts, user tables and any other schemas identified

by Bank.

Do we have the Detailed data as Migration will have

challenges in case the Detailed data is not available

Detailed current landscape picture with platform and

applications should be provided for an accurate

solution

Currently we are using IBM Stack. Bidders are expected to propose high level

migration plan.

672 19 Annexure B - Critical Functional

Requirements

8 - Migration of Data Governance, Data Lineage and

Data Quality rules and policies

What is the Status of the Current Data Lineage as this

is a critical and important area to know and to Migrate

Does SBI have accurate and detail documentation of

current implementation?

Currently we are using IBM Stack. Bidders are expected to propose high level

migration plan.

673 20 Annexure B - Critical Functional

Requirements

All the applications connected to the non-archived

data should be available with archived as well

Please provide use cases for this requirement to

narrow down on type of data access required from

archival data

Access to archival solution is expected to be similar to production setup

674 21 Annexure B - Critical Functional

Requirements

10 - Threshold control to kill the high resource

consumption query

Should not be part of the Monitoring Dashboard part Monitoring dashboard should showcase high resource queries killed due to

crossing threshold limits
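The threshold control described in this clause can be pictured as a filter over currently running queries, with the killed queries then surfaced on the monitoring dashboard. The sketch below is purely illustrative: the `RunningQuery` fields, the limits and the function name are hypothetical, and a real platform would expose its own workload-management interface for this.

```python
from dataclasses import dataclass


@dataclass
class RunningQuery:
    query_id: str
    cpu_seconds: float
    memory_mb: float


def select_queries_to_kill(queries, max_cpu_seconds, max_memory_mb):
    """Return ids of queries that exceed either resource threshold."""
    return [
        q.query_id
        for q in queries
        if q.cpu_seconds > max_cpu_seconds or q.memory_mb > max_memory_mb
    ]


running = [
    RunningQuery("q1", cpu_seconds=12.0, memory_mb=512.0),
    RunningQuery("q2", cpu_seconds=950.0, memory_mb=300.0),    # CPU over limit
    RunningQuery("q3", cpu_seconds=40.0, memory_mb=16_384.0),  # memory over limit
]
to_kill = select_queries_to_kill(running, max_cpu_seconds=600, max_memory_mb=8_192)
print(to_kill)
```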

675 21 Annexure B - Critical Functional

Requirements

24 - Data Reconciliation status for every data

movement on real time basis

Data Reconciliation should be real time for specific

cases not all

No change in standard clause of EOI. Bidders are free to propose solution(s)

in the best interest of the Bank to meet the requirements given in the EOI.

676 23 Annexure B - Critical Functional

Requirements

10 - Identity resolution - Identity resolution is the

process of linking various records and is the main

engine for record de-duplication, which can enable

some aspects of data cleansing.

Is SBI looking for an Identity resolution system or a

full-fledged MDM?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.
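To make the identity resolution clause concrete, the sketch below links records that share a normalized name and date of birth and reports the groups that are candidates for de-duplication. This is a deliberately simple exact-match approach chosen for illustration; the field names and the matching key are assumptions, and a real solution (or a full MDM) would typically use fuzzy matching and more attributes.

```python
import re
from collections import defaultdict


def normalize(record):
    """Build a match key: lowercase letters of the name plus date of birth."""
    name = re.sub(r"[^a-z]", "", record["name"].lower())
    return (name, record["dob"])


def resolve(records):
    """Group record ids by match key and return groups with more than one id."""
    clusters = defaultdict(list)
    for record in records:
        clusters[normalize(record)].append(record["id"])
    return [ids for ids in clusters.values() if len(ids) > 1]


customers = [
    {"id": 1, "name": "R. Kumar", "dob": "1980-01-01"},
    {"id": 2, "name": "r kumar", "dob": "1980-01-01"},
    {"id": 3, "name": "S Rao", "dob": "1975-05-05"},
]
duplicates = resolve(customers)
print(duplicates)
```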

677 23 Data Quality 1)Provide the attributes count for each of the source

system that needs DQ process

2)Approx. how many master attributes are considered

for data quality processes?

3)Records from how many sources and what source

systems needs to go through the data quality

processes?

4)Any process used at present to assess the

authenticity of records after the DQ process is run?

Who are the people (their profiles - data stewards

etc.) involved in the same?

5)Please indicate, Volume of records on which the

current Data Quality processes are carried out? What

is the % of record increments per week/month?

6)Any data enrichment processes being used at

present? If yes, what are they?

7)Please specify the list of Country/Countries for

which we require address verification and

standardization. Please list the country (ies) of origin

for the name and address data to be processed.

8)Is data quality going to be a onetime data cleansing

effort, or are we required to do this periodically?

9)Is there a requirement for a real-time/near real-

time or post facto data cleansing?

10)Please indicate type of data enrichment sources

that should be used, whether any other third-party

tool to be used.

11)Are there serious data gaps in today's scenario that

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

678 24 Annexure B - Critical Functional

Requirements

15 - Each month 5 data quality use cases will be

developed and implemented on Next Gen DW.

Examples of use cases are as given below;

- Profiling of Customer Master table for verifying

PAN, Mobile Number, Date of Birth, Address, Pin

Code, etc

- Profiling of Branch Master for verifying branch

address, contact information, branch manager/staff

information, etc

are we looking for the Validation and Enrichment from

the Open Government API's like MCA21, UIDAI, NSDL?

Do we have agreement with the Government

agencies?

Also what is the suggestion on Paid data sources?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.
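The Customer Master profiling use case in this clause (verifying PAN, mobile number and PIN code) can be sketched as a set of format rules applied to each row. The patterns below follow the commonly documented formats (PAN as five letters, four digits and one letter; ten-digit mobile numbers starting with 6 to 9; six-digit PIN codes not starting with 0). The column names and the sample rows are hypothetical, and format checks alone do not confirm that a value is genuine.

```python
import re

# Format rules keyed by (hypothetical) column name.
RULES = {
    "pan": re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]"),
    "mobile": re.compile(r"[6-9][0-9]{9}"),
    "pin_code": re.compile(r"[1-9][0-9]{5}"),
}


def profile(rows):
    """Count, per column, how many rows fail the format rule (missing counts as a failure)."""
    failures = {field: 0 for field in RULES}
    for row in rows:
        for field, pattern in RULES.items():
            value = row.get(field) or ""
            if not pattern.fullmatch(value):
                failures[field] += 1
    return failures


sample = [
    {"pan": "ABCDE1234F", "mobile": "9876543210", "pin_code": "400614"},
    {"pan": "ABCD1234F", "mobile": "1234567890", "pin_code": "400614"},
    {"pan": "ABCDE1234F", "mobile": None, "pin_code": "012345"},
]
report = profile(sample)
print(report)
```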

679 27 Annexure B - Critical Functional

Requirements

Data Masking What is the Plan for the Data Masking - is this Dynamic

Data Masking or Persistent Data Masking for the data

porting to non prod environments

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.
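The distinction raised in this query can be illustrated as follows: persistent masking replaces the value before it is stored (for example when porting data to non-prod), while dynamic masking leaves the stored value intact and masks it at read time depending on who is asking. The sketch below is an assumption-laden illustration, not the Bank's design: the key, the role name and both function names are hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical key, for illustration only


def persistent_mask(value: str) -> str:
    """Irreversibly replace a value with a keyed hash before it is stored."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # same input gives the same token, so joins still work


def dynamic_mask(value: str, role: str) -> str:
    """Mask at read time: only a privileged role sees the stored value."""
    if role == "privileged":
        return value
    visible = value[-4:] if len(value) > 4 else ""  # short values are fully hidden
    return "*" * (len(value) - len(visible)) + visible


mobile = "9876543210"
print(persistent_mask(mobile))          # what a non-prod copy would hold
print(dynamic_mask(mobile, "analyst"))  # what a non-privileged reader would see
```

Persistent masking suits copies that leave production; dynamic masking suits production access where some roles still need the original values.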

680 28 Annexure B - Critical Functional

Requirements

8 -Real time reporting can be done through Staging

area of Data Warehouse

Do we want to have all the Data in staging as is being

done in AS-IS or can bidder suggest other options

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

681 28 Annexure B - Critical Functional

Requirements

Implementing end to end analytics use-cases as

mandated by the Bank

Have any AI/ML use cases already been implemented?

How many such cases?

This information is not required at this stage of EOI.

682 29 Annexure B - Critical Functional

Requirements

Data Science Platform with AI/ML Capabilities

Vendor to provide solution / tool (s) for below

scope of activities on SBI data sets;

- Benchmarking

- Predictive & Prescriptive Analytics

- Social Media Analytics

- Web Analytics

- Geolocation Analysis

- Ad-Hoc Analysis

- Trend Indicators

- Profit Analysis

- In-Memory Analysis

- Statistic Analytics

- Data Mining

Do we have any quantification of the models need to

be created as part of the project

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

683 30 Business Intelligence Tools 1)Please provide the source systems details for BI &

Analytics and list the formats of data.

2)Please share the volumetric details of source

systems

3)Please share details for existing data quality and

consistency across data sources

4)Do you envisage any need for unstructured data in

future

e.g- Text, XML files, Audit log, files, pictures, social

media.

5)Please provide the number of users to be

provisioned for analytics and business intelligence

tools users from a Licensing standpoint and YOY

growth expected on the same.

6)While proposing licenses, do we go for Perpetual

licenses which would help beyond implementation

and support or subscription based licenses.

7)Please provide Data Retention policies in Target

System. Please specify the time period for the Storage

of on Online Data as well as Archival Data

8)How many years of data considered for history load

and data analysis

9)Please let us know the Number of Reports and

Dashboards expected with complexity level (I.e.

Simple, Medium & Complex) ?

10)How many metadata layers are required for

reporting ?

11)Is Multilingual reporting expected ?

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

684 31 Annexure B - Critical Functional

Requirements

Mobile version: BI tools should be able to

differentiate between viewing BI applications on a

web browser on a mobile device versus a mobile BI

application.

Does bank need reports to be integrated with its

current mobile platform and application or standard

mobility apps of BI OEMs can be used?

Yes

685 36 Annexure B - Critical Functional

Requirements

36 - NEXT-GEN DW is expected to have more users

and the solution should not be bound by any

license model for number of users

Not all products support Core based Model, pl suggest This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

686 26 Annexure B - Critical Functional

Requirements

High Volume, High Performance and Reliability –

Scalable and resilient architecture which will handle

all volume and performance demands.

Does regulatory requirement need archival data access

as well?

Yes. It should have the capability to fetch data from Archival solution

687 39 Annexure D Existing Data Warehouse Architecture What are the current number of users on

BI/ETL/Database Level. User concurrency at

BI/ETL/Database. Also what is type/make/core

information of ETL/DWH/ODS/DM Servers.

Duplicate Query. Refer to Sr.No. #280

688 39 Annexure D Existing Data Warehouse Architecture Existing Backup Details are not provided , please

provide backup and other configuration details .

Existing backup are disk based /tape based/frequency

of backups

Duplicate Query. Refer to Sr.No. #281

689 19 Disaster Recovery

clause 1

Bank proposes to setup only functional DR to start

with. At later stage Bank may take decision to

setup full scale 100% DR.

Please elaborate on functional DR in terms of PROD

capacity . Is DR being looked from Day 1?

Duplicate Query. Refer to Sr.No. #282

690 19 DR Clause 4 The DR solution should be synced with production

NEXT-GEN DW. The SLA for RTO should be

maximum 2Hrs as per Bank’s defined policy.

Since the volumes involved are large , the bandwidth

capacity and the time slots for replication will be

provided by the Bank. Please clarify

Duplicate Query. Refer to Sr.No. #283

691 19 DR Clause 6 The proposed solution is expected to have a

monitoring engine that can determine the health of

production NEXT-GEN DW and raise alerts / trigger

remedial actions to bring NEXT-GEN DW – DR as

the default NEXT-GEN DW

Bank wishes to have an automation tool for the same .

Please clarify.

Duplicate Query. Refer to Sr.No. #284

692 32 HW Specs clause 10 Vendor must provide detailed configuration of the

proposed Hardware, including Hosting Space

Requirements, Racks, Power, Cooling and any other

requirement for the fulfillment of the Vendor’s

obligation in this EOI.

For exact sizing various inputs/interaction will be

required . For EOI an approx. indication should suffice

. Please confirm

Duplicate Query. Refer to Sr.No. #285

693 35 HW Specs Clause 30 Vendor is required to provide the minimum

resources to monitor & manage the infrastructure,

however it is the Vendor’s responsibility to right

size the resources to meet the SLA

Clarification needed on Bank's expectations on

number of resources. Can you please explicitly

mention the SLA requirements.

Duplicate Query. Refer to Sr.No. #286

694 36 HW Specs Clause 44 Vendor need to propose a solution for data

migration / transfer between Existing DWH (Navi

Mumbai Location 1) and NEXT-GEN DW-PR (Navi

Mumbai Location) and also between NEXT-GEN DW-

PR (Navi Mumbai Location 2) and Hyderabad (DR)

or any other places for PR and DR decided by the

Bank.

Please share details of locations , approx distances ,

bandwidth capacity to be provided by Bank. Please

clarify.

Duplicate Query. Refer to Sr.No. #287

695 36 HW Specs Clause 48 The vendor should provide EXACT size needed for

production in the 1st year and estimated sizes for

consecutive years keeping in view the growth rate

predicted by Bank in this section and provide

empirical evidence for the calculation of growth

rate.

For exact sizing various inputs/interaction will be

required . For EOI an approx. indication should suffice

. Please confirm

Duplicate Query. Refer to Sr.No. #288

696 39 Annexure D Existing Data Warehouse Architecture Please share : 1. Detailed architecture diagram of

existing EDW & 2. Breakup of tiers among the

80+ production servers.

Duplicate Query. Refer to Sr.No. #289

697 46 Annexure G Next Gen Data Warehouse Sizing DR to be sized only for DWH . Please confirm Duplicate Query. Refer to Sr.No. #290

698 40 Annexure E Functional Use Cases Is Bank looking to leverage existing OFSAA

implementation for these use cases. Alternatively will

the OFSAA feed from the DWH. Please let us know the

interplay between DWH and OFSAA.

Duplicate Query. Refer to Sr.No. #291

699 40 Annexure E Functional Use Cases - does the bank see the Next-Gen-DW as a

complement to existing data solutions at the

bank (e.g. OFSAA) ?

Duplicate Query. Refer to Sr.No. #292

700 40 Annexure E Functional Use Cases - role of Next-Gen-DW in mission critical

operational processes e.g. daily RBI and regulator

reporting, real-time financial crime detection,

etc.

Duplicate Query. Refer to Sr.No. #293

701 12 Annexure B End State Objective : Migration from existing setup Please provide complete detail of existing DWH

Solution. Information including but not limited to:

DWH platform, model, version and Size

Details of CPUs, Memory, OS, Database version

Details of Storage configuration: Size, capacity, free

and used space etc

Whether there are any physical or logical isolation of

DWH setup

HA, Backup, DR details

Currently we are using IBM Stack. Bidders are expected to propose high level

migration plan.

702 16 4 Storage replication (e.g. RAID) should be

automatically managed by the platform.

We believe this is a RAID level if not then please

elaborate more

Duplicate Query. Refer to Sr.No. #296

703 16 8 Storage should support data compression. It should

be possible to perform both fast compression and

efficient compression based on data processing

needs.

We recommended Bank to additionally ask for Storage

and DWH system capable of providing Columnar based

as well row based compression. And data should be

readable without decompression

Duplicate Query. Refer to Sr.No. #297

704 16 9 The storage should be horizontally and vertically

scalable. Redistribution of data across the NEXT-

GEN DW should be possible automatically and

seamlessly.

We recommended Bank to additionally ask for

"Storage upgrade and capacity increase should be done

without downtime"

Duplicate Query. Refer to Sr.No. #298

705 16 11 The storage system should be robust to handle at

least 1,50,000 concurrent queries (Select/DML) by

processing engines / ETL jobs / end users scalable

up to 6,00,000 concurrent queries in next 5 years

(assuming parallelism of 100 degree).

1. Please provide details of queries and ETL jobs

2. Please provide ratio of Select/DML queries and ETL

jobs

3. Share the complexity of query- simple, medium,

complex Queries

4. Also provide Query definition of Simple, Medium

and Complex queries

Duplicate Query. Refer to Sr.No. #299

706 16 12 Downstream departments (data consumer) to be

given separate processing power, storage to

undertake their requirements with separate DB

snapshot, Audit trails should be available for any

user accessing the Databases. Construction of this

separate Database snapshot and enabling this audit

trails must not cause any major systemic

issues/challenges in smooth functioning of primary

DB.

Please provide rationale and logic to have separate

processing and storage for this. Powerful DWH

systems are today capable of servicing all consumer

groups in parallel

Duplicate Query. Refer to Sr.No. #300

707 17 12 ETL/ELT tool for data extraction should have AI/ML

features for suggesting / improving Query / ETL /

ELT Stages

Please clarify further with an example. Duplicate Query. Refer to Sr.No. #301

708 17 13 Existing reports and extracts generation jobs on

DWH should be analyzed and transformed to the

NEXT-GEN DW. The vendor should use preferably

off-the- shelf tools and not resort to building from

scratch.

Please share sample reports and extracts to propose

most suitable options for migration

Duplicate Query. Refer to Sr.No. #302

709 17 Data transformations should be triggered in

parallel. The NEXT-GEN DW should be capable to

run multiple transformation jobs in parallel. The

NEXT-GEN DW should be able to run at-least 1500

jobs in parallel, scalable up to 5000 in next 5 years,

of varying complexity - simple, medium, complex, in

batch or near real

Please define complexity - simple, medium, complex

Share the split % of simple, medium, complex

Please share sample jobs for each type

Duplicate Query. Refer to Sr.No. #303

710 27 User Management: Pt4-The access privileges

associated with each system product, e.g.

operating system, network, database, application

and system utilities, and the users to which these

privileges need to be allocated should be clearly

identified and documented.

Should we assume that the access privileges are to be

assigned to the users directly and managing access to

these privileged accounts is not required?

Duplicate Query. Refer to Sr.No. #304

711 39 User database of 30000+ officials Should we assume approx 30k users would access the

solution with a YoY increase of 10%?

Duplicate Query. Refer to Sr.No. #305
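For sizing purposes, the growth assumption in the query above compounds year on year. The following is a minimal sketch, assuming the 30,000 base and the 10% rate proposed in the bidder's query (neither figure is confirmed by the Bank) and a five-year horizon borrowed from the scalability clauses elsewhere in the EOI. The function name is illustrative only.

```python
def project_users(base_users: int, yoy_growth: float, years: int) -> list[int]:
    """Return the projected user count for year 0 through `years`,
    compounding the growth rate once per year."""
    return [round(base_users * (1 + yoy_growth) ** year) for year in range(years + 1)]


# Assumed inputs: 30,000 users today, 10% YoY growth, 5-year horizon.
projection = project_users(30_000, 0.10, 5)
for year, users in enumerate(projection):
    print(f"Year {year}: {users} users")
```

Under these assumptions the user base grows by roughly 61% over five years, so a linear estimate of 50% would undersize the final year.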

712 30 6 Reporting on all types of available of Data Formats;

· Structured, semi-structured, unstructured

· Click stream data

· Audit Logs

· Documents

· Multimedia data (Images/Videos/Audios)

· XBRL format

· IRIS iFILE framework

Please clarify the type of database used for

each of the data formats asked for. Or is it safe to assume

the underlying data will be in an industry-standard

relational data format like Oracle, DB2 etc.?

Duplicate Query. Refer to Sr.No. #306

713 30 5 Visualizations: BI tools must provide below

different types of visualizations;

· Animations, Barcodes

· Bar, line, pie, area and radar chart types

· Tables, Graphs, Infographics, Filters

· Widgets

· Drag and Drop Creation, Customization

· Templates

· Freehand SQL Command

· Geospatial Integration

· Layouts

· Themes

· Ability to mix and match various combinations

Please elaborate on the definition of

-Animations

-Infographics: which visualizations are you referring to

-Widgets, Templates

Duplicate Query. Refer to Sr.No. #307

714 32 22 In-memory analytics: The product should pull data

into an in-memory or locally cached data store

preferably columnar is an increasingly popular

feature that enables very fast analytics once the

data is loaded.

To achieve fast analytics, BI tools may adopt

different architectures. A BI tool can easily leverage the

in-memory benefits of the underlying database without

pulling data, creating data redundancy and thereby

reducing data manageability at the BI layer. We request

you to rephrase this point as

"In-memory analytics: The product should pull data

into an in-memory or leverage the In-Memory

capabilities of underlying database or locally cached

data store preferably columnar is an increasingly

popular feature that enables very fast analytics once

the data is loaded."

Duplicate Query. Refer to Sr.No. #308


715 32 23 Offline updates: BI tools, when storing copies of

the source data in an online analytical processing

(OLAP) cube or in-memory columnar data store,

should enable business users to schedule

automatic data updates.

Different BI tools have different architectures. A BI tool

can easily leverage the capabilities of the underlying

database. Is this point relevant to BI tools whose

architecture is to store the data with the BI Server?

Duplicate Query. Refer to Sr.No. #309

716 32 28 Speed of access: Query performance will vary based

on the complexity of the queries and the amount of

data involved. Dashboards with multiple

visualizations will need to get query results from

many queries. The best practice is to create several

prebuilt query scenarios and compare how each

product performs based on these specific

examples. The worse practice is to just arbitrarily

rate the speed.

These seem to be best practices for implementation.

Please elaborate on what the requirement is for the BI

Tool

Duplicate Query. Refer to Sr.No. #310

717 32 29 The best practice is to establish a testing

environment to determine scalability in terms of

both the number of concurrent users and data

metrics, such as volumes, variety and veracity.

These seem to be best practices for implementation.

Please elaborate on what the requirement is for the BI

Tool

Duplicate Query. Refer to Sr.No. #311

718 32 32 Ability to handle and summarize huge volumes of

data. E.g. 30-40 million rows accessed on index and

summarized over 5 to 8 metrics.

Please elaborate on the use case for consumption of 30-

40 million rows from BI. Usually a BI tool leverages the

underlying database to do summarization of data and

only works on the resulting dataset

Duplicate Query. Refer to Sr.No. #312

719 35 36 The web portal of Business Intelligence tool should

support at-least 25000

concurrent users, scalable up to 75000 in next 5

years, accessing various reports generated

For doing the sizing of Business Intelligence we need

bifurcation of the concurrent users (25000)

Total Concurrent Users : 25000

Number of concurrent active : provide the concurrent

active user count

Number of logged-in/inactive : provide the

logged-in/inactive user count

Out of Active Users:

- Users executing BIEE dashboards (having 4/5 reports

or simple charts in a dashboard)

- Users executing large Pivot table operations (25000+

rows)

- Users executing (small to medium sized report - 50K

cells or lower) export to pdf/XL operations

- Users executing very heavy Graphics

Number of Active Concurrent running Extra Large

Reports

(Usually Extra Large Reports are executed off-line

hours)

Duplicate Query. Refer to Sr.No. #313

720 15 point # 16 Proposed solution should be able to scrap

encrypted log, capture Metadata

changes at source level completely, scrapping 4000-

5000 logs daily having log

size of ~ 2 TB each scalable up to 10000 logs.

Proposed solution should be

capable of scrapping logs generated by any type of

Database. E.g. Oracle

Database, IBM DB2 Database etc.

Kindly specify if decryption of logs is also required or

only storage of such logs is fine ? If yes, is there

decryption logic available in specified system ?

Duplicate Query. Refer to Sr.No. #314
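The daily volume implied by this clause can be worked out directly. This is a rough sketch that reads the clause literally, i.e. it assumes "~ 2 TB" is the size of each individual log rather than a daily total, and it uses decimal units (1 PB = 1000 TB). Both are assumptions; the EOI response does not confirm either.

```python
TB_PER_PB = 1000  # decimal units, an assumption for this illustration


def daily_log_volume_tb(logs_per_day: int, tb_per_log: float = 2.0) -> float:
    """Total log volume per day in TB, assuming every log is about tb_per_log TB."""
    return logs_per_day * tb_per_log


# 4000-5000 logs per day today, scalable up to 10000 logs (figures from the clause).
for logs in (4000, 5000, 10000):
    volume_tb = daily_log_volume_tb(logs)
    print(f"{logs} logs/day -> {volume_tb:.0f} TB/day ({volume_tb / TB_PER_PB:.0f} PB/day)")
```

If the per-log reading is correct, whether raw logs must be retained or only processed output has a very large effect on storage sizing, which is why the retention question matters.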

721 38 Annexure C -Monthly Data processed

in DWH Warehouse

Archived log extract CBS (SBI) +

TF (SBI)

Are these logs encrypted?

Do we need to keep the RAW logs into the system? Or

only processed logs ?

Duplicate Query. Refer to Sr.No. #315

722 29 # 9 (Data Science Platform with

AI/ML Capabilities)

GPUs to be incorporated in solution if possible

using HDFS Hadoop like

environment for better analytical results

1. Is there requirement to run AI/ML models within

HDFS Hadoop ? Or Expectation is to pull the data into

GPU based analytics workbench and then process.

2. Running AI/ML models within Hadoop is also faster

and Having separate GPU based system for specific AI

models can reduce the cost of GPU based solution.

Please suggest

Duplicate Query. Refer to Sr.No. #316

723 15 # 1 - Data Storage A multi-temperature data management solution to

be proposed by vendor where

data that is frequently accessed is stored on fast

storage (hot data), compared to less frequently

accessed data stored on slightly slower

storage (warm data), and

rarely accessed data stored on the slowest storage

(cold data). System should

also be capable of automated storage tiering and

seamless data transfer between

hot, warm and cold storage. Data residing in any of

these storage areas must be

seamlessly mixed / merged according to

requirements without impacting

performance.

Kindly share the tentative timeline for Hot/Warm/Cold

data so that we could calculate the size. Example: Hot

Data - 6 months, Warm Data - 1 year, Cold data > 1

year etc.

Duplicate Query. Refer to Sr.No. #317
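The example timeline in the query above can be expressed as a simple age-based tiering rule. This is a minimal sketch only: the thresholds come from the bidder's example (hot up to 6 months, warm up to 1 year, cold beyond that), the day counts are approximations, and the function name is hypothetical. The Bank has not confirmed any of these values, and a real platform would normally tier on access frequency as well as age.

```python
from datetime import date

# Thresholds taken from the bidder's example, not from the Bank (assumptions).
HOT_MAX_AGE_DAYS = 183   # roughly 6 months
WARM_MAX_AGE_DAYS = 365  # roughly 1 year


def storage_tier(last_accessed: date, today: date) -> str:
    """Classify data as hot, warm or cold from the days since it was last accessed."""
    age_days = (today - last_accessed).days
    if age_days <= HOT_MAX_AGE_DAYS:
        return "hot"
    if age_days <= WARM_MAX_AGE_DAYS:
        return "warm"
    return "cold"


today = date(2020, 3, 1)
for accessed in (date(2020, 1, 1), date(2019, 6, 1), date(2018, 1, 1)):
    print(accessed, "->", storage_tier(accessed, today))
```

With a rule of this shape, the size of each tier follows from the monthly data volumes once the Bank supplies the actual retention windows.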

724 18 18 Transformations for this activity can be categorized

into the following types:

· Existing transformations in DWH that needs to be

migrated to NEXT-GEN DW

Please share the existing transformation details. Duplicate Query. Refer to Sr.No. #318

725 20 1 Data older than specific duration as identified by

Bank to be archived in low cost cold storage.

Changing data archival rules should be easily

configurable. Vendor to propose solution for the

same with cheap and flexible storage and

processing

Please provide the retention period. Duplicate Query. Refer to Sr.No. #319

726 20 5 Store backup of entire ecosystem on suitable cost-

effective, fast recovery infrastructure (Currently

tape backup is taken)

Please provide the details of existing Tape Backup

Solution, Backup Window, Backup Throughput and

Restoration throughput

Duplicate Query. Refer to Sr.No. #320


727 13 Annexure B - Technical Criteria/Scope

of Work > Critical Functional

Requirements > Data Ingestion Sr #2

Data may be structured, semi-structured, and

unstructured. It may come from internal or external

sources. It may come in batches, incremental

additions or real-time feeds. There should be no

limitation on the type, format and size of data

ingested. Data may include log, feeds, audio, video,

image, NOSQL, RDBMS, unstructured text, through

ERP systems, etc

Which RDBMS source data is required to be extracted

in real-time mode? Please provide the source system

name, RDBMS type (Oracle/ SQLServer etc) and the

underlying OS

Duplicate Query. Refer to Sr.No. #321

728 19 Annexure B - Technical Criteria/Scope

of Work > Critical Functional

Requirements >Migration from

Existing Setup to Proposed Solution

Sr #5

Migration of existing data extraction and reporting

jobs.

Since this is across different products, is this expected

to be semi-automated / manual?

Duplicate Query. Refer to Sr.No. #322

729 19 Annexure B - Technical Criteria/Scope

of Work > Critical Functional

Requirements >Migration from

Existing Setup to Proposed Solution

Sr #6

Migration of monitoring dashboard data points. Since this is across different products, is this expected

to be semi-automated / manual?

Duplicate Query. Refer to Sr.No. #323

730 19 Annexure B - Technical Criteria/Scope

of Work > Critical Functional

Requirements >Migration from

Existing Setup to Proposed Solution

Sr #7

Migration of user details. Since this is across different products, is this expected

to be semi-automated / manual?

Duplicate Query. Refer to Sr.No. #324

731 19 Annexure B - Technical Criteria/Scope

of Work > Critical Functional

Requirements >Migration from

Existing Setup to Proposed Solution

Sr #8

Migration of Data Governance, Data Lineage and

Data Quality rules and policies

Since this is across different products, is this expected

to be semi-automated / manual?

Duplicate Query. Refer to Sr.No. #325

732 38 Annexure C - Monthly Data

processed in DWH Warehouse Sr #4

Contribution from other source systems like DMAT,

CMP, SBI Life, LOS, etc.

Will these also be flat files? If not, what will be the

interface mode (RDBMS, webServices, API etc) and

which RDBMS?

Duplicate Query. Refer to Sr.No. #326

733 38 Annexure C - Monthly Data

processed in DWH Warehouse Sr #4

Contribution from other source systems like DMAT,

CMP, SBI Life, LOS, etc.

Please provide a count of data sources (best

approximation), and of these how many will be flat file

sources?

Duplicate Query. Refer to Sr.No. #327

734 23 Annexure B - Technical Criteria/Scope

of Work > Critical Functional

Requirements > Data Quality Sr #12

Mechanism to capture feedback from end users to

report Data Quality issues

Please elaborate. Can this be implemented using

enterprise collaboration tooling / ticket maintenance

system?

Duplicate Query. Refer to Sr.No. #328

735 39 Annexure D - Existing Data

Warehouse Architecture Sr #14

Data Quality What are the existing Data Quality details? How many

and which entity masters are maintained? What is the

current count of each type of Entity and how are their

counts expected to scale up (volumetrics)?

This information is not required at this stage of EOI. Bidders are free to

propose solution(s) in the best interest of the Bank to meet the requirements

given in the EOI.

736 35 24 Vendor needs to provide Helpdesk support 24X7 to

the Bank for end to end support for hardware

maintenance.

Please confirm whether Helpdesk services need to be onsite or

from Wipro's remote delivery location SNXT, Mysore.

Onsite only

737 Please mention the SLA (resolution, response time,

etc.) for Infrastructure Operation and Maintenance

(L1, L2, L3) along with Priority definitions of Incidents

and Service Requests.

This information is not required at this stage of EOI.

738 Please confirm whether the ITSM tool / Helpdesk tool will be

provided by SBI. Also, please name the ITSM tool present in

the environment.

Bidder will have to propose required tool. Bank will take a call on it at

appropriate time.

739 Please confirm whether the ITAM tool / Asset management tool

will be provided by SBI. Also, please name the ITAM tool

present in the environment.

Bidder will have to propose required tool. Bank will take a call on it at

appropriate time.

740 Please confirm the EMS Tool/ Infrastructure

monitoring tool present in SBI's environment. Also

please confirm Wipro can leverage the same for

monitoring.

Bidder will have to propose required tool. Bank will take a call on it at

appropriate time.

741 Please confirm whether a patch management tool is available

in the environment and whether the same will be extended to

Wipro.

Bidder will have to propose required tool. Bank will take a call on it at

appropriate time.

742 SI assumes that, with DC and DR being on Cloud, there is no need

for Hands and Feet Support. However Please confirm

if SI needs to provide "Hands and Feet Support" for

any DC/ DR hardware. If yes, please provide the

location (pin code) and service window.

This information is not required at this stage of EOI.

743 Please confirm whether backup software is available in the

environment and whether the same will be extended to Wipro.

Bidder will have to propose required tool. Bank will take a call on it at

appropriate time.

744 Please confirm the DC/DR drill frequency. This information is not required at this stage of EOI.

745 27 1 Please confirm whether any BCP tools/solutions are present with SBI

and whether the same can be extended to Wipro

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

746 13

specification

Please clarify what is meant by internal and external

specification.

Duplicate Query. Refer to Sr.No. #637

747 34 15 The Vendor shall ensure all Installations &

Implementation to be done by OEM badged

resources only

Request the Bank to modify this to OEM/Bidder

resources.

Duplicate Query. Refer to Sr.No. #638

748 15 Architecture diagram Deployment plan - Vendor to

submit architecture diagram of entire setup with

network and security equipment required. Bank

may change it after vetting by Information Security

Dept and / or Enterprise architecture Dept. It will

be binding on vendor.

Will this setup be completely green field? Is the bidder

expected to supply the network and security

equipment?

Duplicate Query. Refer to Sr.No. #639

749 34 18 The proposed hardware is mission critical for the

proposed project and support of 24 X 7 with an

uptime of 99.99 % to be ensured by providing

support at PR, and DR site for a period of 5 years.

Is bidder expected to provide onsite resources for

every domain? Please clarify.

Duplicate Query. Refer to Sr.No. #640
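To put the 99.99% uptime figure in concrete terms, the permitted downtime can be computed as below. This is a rough sketch; it assumes uptime is measured over a 365-day year of continuous 24 X 7 operation, which the clause implies but does not state, and the function name is illustrative.

```python
def allowed_downtime_minutes(uptime_pct: float, period_days: int = 365) -> float:
    """Downtime budget in minutes for a given uptime percentage over the period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - uptime_pct) / 100


yearly_budget = allowed_downtime_minutes(99.99)
print(f"99.99% uptime allows about {yearly_budget:.2f} minutes of downtime per year")
```

A budget of under an hour per year leaves little room for manual recovery, which is relevant when deciding how many support domains need onsite coverage.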

750 35 22 The Hardware solution must be compatible to

integrate with various systems in the Bank

including but not limited to SOC, PIMS, NOC,

Command Centre, ITAM, Service Desk, ADS, and

SSO etc. at no extra cost. Vendor will have to give

appropriate support to the Bank during integration

with various components of IT environment.

Please clarify what integration is expected? Duplicate Query. Refer to Sr.No. #641

751 4 Background Data Insights / Patterns/ Real-Time The Elasticsearch functionalities required for

addressing it are not clearly mentioned.

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.


752 12 End State Objectives AI/ML Capabilities & Real-Time Analytics. Log

Analytics

The latest Elasticsearch functionalities to address the

requirement are not clearly stated in the document

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

753 22 Data Dictionary Search

Audit Trail/Log

Request to specify in detail

This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

754 26 Audit & Log Management Audit & Log Mgmt Are the latest features and functionality for Audit & Log

Analytics required?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

755 40-43 Functional Use Cases Customer / Product / Fraud Use Cases Are the advanced capabilities for searching & analyzing

data to detect anomalies, patterns, fraud, segmentation

etc required?

Yes, Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

756 NA General Query Security & Cyber Insurance Are Log Analytics, Metrics, Security Analytics required

as part of proposed solution?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

757 NA General Query Enterprise Search Is Enterprise & Application Search capability required

as part of the proposed solution?

Bidders are free to propose solution(s) in the best interest of the Bank to

meet the requirements given in the EOI.

758 11 Annexure-I Eligibility Criteria New Point. EoI is silent on Consortium bidding Can the EoI be submitted by a Consortium of 2

companies? If not, can a bidding agency sub-contract

a few niche components at a later stage?

The bidder will be a single organization and may have arrangements with

various other organizations at the back-end. The nature of the arrangement

at the back-end will be decided by the Bank at appropriate time.

759 New Point Has the Bank chosen any technology stack? This is a solution discovery phase. Hence we are asking for best possible

technologies to give best performance in proposed solution. Bidders are free

to propose suitable solution to meet the requirements of this EOI.

760 New Point Is the Bank open to "Open Source Software"

tools and platforms?

Please refer to EOI clauses for more details

761 New Point Should the Next Gen DWH solution be deployed on

premise or cloud or Hybrid?

Public and private clouds deployed in non-SBI data centers are not required

as part of solution.

762 New Point During RFP Stage, SBI is requested to keep a turnover

limit where-in niche product/ solution companies like

Posidex would be able to bid. They should also permit

either consortium bidding or allow sub-contracting of

few specialised tasks viz. entity resolution/ creation of

Golden Record/ use of block-chain etc

This information is not required at this stage of EOI.

763 13 2 Data Ingestion

Data may be structured, semi-structured, and

unstructured. It may come from internal or external

sources. It may come in batches, incremental

additions or real-time feeds. There should be no

limitation on the type, format and size of data

ingested. Data may include log, feeds, audio, video,

image, NOSQL, RDBMS, unstructured text, through

ERP systems, etc

Is the NOSQL data mentioned here JSON/XML ? What

kind of processing on NOSQL is expected ? Is there a

need to join the NOSQL data with other relational data

? Is there a need to shred the NOSQL data into

relational data ?

Duplicate Query. Refer to Sr.No. #189

764 14 11 Data Ingestion

One of the most important feature is the richness

of the transformations to do day-to-day tasks, such

as;

Data conversion, lookup, expression, joining

records, splitting data, filtering, ranking, sorting,

grouping, looping, and combining data,

pivot/unpivot, converting dates, setting variables

based on parameter files, merging rows, finding the

latest file, and splitting data based on certain

conditions, running web methods, transforming

XML documents, rebuilding indexes, sending

emails, profiling data, handling arrays and records,

processing unstructured data, masking, monitoring

the inbound data flow for completeness,

consistency and accuracy, wizards to assist creating

complex packages, like loading fact tables, or type

two slowly changing dimensions (SCD – T2)

Are any tools / licenses already available with SBI for

ingesting data used in the existing DW solution?

Please provide a list. Any existing tools that align with

the new solution can be considered for reuse if found

to be a good fit.

Duplicate Query. Refer to Sr.No. #190

765 15 1 Data Storage

Vendor should propose effective number of data

storage layers in NEXT-GEN DW between data

ingestion and data consumption.

In the existing DW solution, are surrogate keys used?

If yes, is there a framework for storage and

management of the keys to ensure robustness of the

data warehouse?

Duplicate Query. Refer to Sr.No. #191

766 16 11 Data Storage

The storage system should be robust to handle at

least 1,50,000 concurrent queries (Select/DML) by

processing engines / ETL jobs / end users scalable

up to 6,00,000 concurrent queries in next 5 years

(assuming parallelism of 100 degree).

Will all these queries be executing in-flight at the

same time ? Or will these be initiated over a period of

time and the aggregate number of queries run during

that time period will be 600,000 ?

Duplicate Query. Refer to Sr.No. #192

767 16 13 It should be possible to project and view data

through multiple modes using the Storage on NEXT-

GEN DW. Varieties of GUIs should be available to

project or view the output generated through

analytic processes. For instance: The Bank may

decide to implement use-cases that project

transactions data as a graph data structure. The

Storage solution on NEXT-GEN DW should allow for

such projections.

Do we need to enable graph based analytics as well or

it should be only limited to the facility to access the

data for graph based analytics.

Duplicate Query. Refer to Sr.No. #193

768 17 1 For the data to be accessible and consumable by

businesses / downstream applications, the NEXT-

GEN DW should have robust, highly efficient and

parallel execution of data transformation jobs.

We can enable JDBC/ODBC or REST API based access.

Will any specific connection mechanism be needed, such as

specific types of drivers for connectivity with the

SAS system etc.?

Duplicate Query. Refer to Sr.No. #194

769 17 2 Data Processing Framework

The NEXT-GEN DW ecosystem should have state of

the art data processing engines that can perform in-

memory processing to reduce the time for data

transformations and query in case of real time

requirements.

What are the expected latency requirements for real

time processing? The solution may vary by use case,

depending on whether the requirement is for immediate

processing or a latency of up to 5-10 minutes.

Duplicate Query. Refer to Sr.No. #195


770 17 3 Framework should allow joining multiple

sources/tables/inputs etc.

The source here means the data ingested from the

multiple source systems and present on the NEXT-GEN

DW platform, not the data at the source systems.

Duplicate Query. Refer to Sr.No. #196

771 17 5 Framework should be capable of performing

validation checks pre-and post- processing.

What will be the outcome of validation? Duplicate Query. Refer to Sr.No. #197

772 17 8 Data Processing Framework

Should have audit and error logs for auditing and

troubleshooting

Is there a requirement to maintain row-level

traceability of the data records, i.e. from the data

consumption layer backwards up to the originating

source file for a particular record?

Duplicate Query. Refer to Sr.No. #198

773 17 9 Automatic recovery of data after failure/rejection

of record needs to happen without any manual

intervention

Is there any specific treatment that needs to be

performed for rejected records?

Duplicate Query. Refer to Sr.No. #199

774 17 11 Framework should have mechanism to protect data

at rest and at motion from unauthorized user

access and amendments.

Is it required to encrypt data at rest and in motion? Is

there a data masking tool available with the bank? Is

there a need for data masking?

Duplicate Query. Refer to Sr.No. #200

775 17 14 Data Processing Framework

Data transformations should be triggered in

parallel. The NEXT-GEN DW should be capable to

run multiple transformation jobs in parallel. The

NEXT-GEN DW should be able to run at-least 1500

jobs in parallel, scalable up to 5000 in next 5 years,

of varying complexity - simple, medium, complex, in

batch or near real time mode every day.

Is the bulk of the data transformation jobs expected to

be triggered during non-business hours, when user

reporting or other workloads are at a minimum? Will

there be users across multiple timezones, or will a

large part of the user base be within a single

timezone?

Duplicate Query. Refer to Sr.No. #201

776 17 17 The processing pipelines for ETL/ELT jobs also

include real time, daily, weekly, monthly, quarterly

and annual reports, feeding data structures for

downstream consumption. These activities are in-

scope for this engagement.

How many such reports are there, and what is the data

model for them? This is needed to estimate the number of ETL

jobs required for the end system.

Duplicate Query. Refer to Sr.No. #202

777 17 19 The workflows should work with standard

schedulers. Monitoring and management of

workflows should be possible from an easy to use

interface. Workflow management tool(s) should

have connectors / pluggable interfaces to already

existing / in-use proprietary software available with

the Bank. These could be (and not restricted to)

data repositories, reporting tools, data analysis

tools and generic interfaces for data transfer.

Scheduled jobs status should be made available to

the Bank in Monitoring dashboard on real time

basis.

What are the supported integration mechanisms with the proprietary

tool? Does that tool support REST based integration?

Which scheduling and monitoring tool does the Bank

have?

Duplicate Query. Refer to Sr.No. #203

778 18 1 Migration from Existing Setup to Proposed Solution

Vendor should propose a detailed seamless

automated migration plan from existing setup to

proposed solution. Plan should focus on less

manual intervention, data reconciliation between

the systems and minimum parallel run of existing

and proposed solution.

Integration of real time data with the data on the

NEXT GEN DW is possible, but does this requirement

mean integrating with multiple source systems? Will

we get direct access to the source systems?

Duplicate Query. Refer to Sr.No. #204

779 18 1 Migration from Existing Setup to Proposed Solution

Vendor should propose a detailed seamless

automated migration plan from existing setup to

proposed solution. Plan should focus on less

manual intervention, data reconciliation between

the systems and minimum parallel run of existing

and proposed solution.

What is the defined reconciliation mechanism? Is it

point-in-time based? The systems will

hold different data depending on the timing and

frequency of ETL job execution.

Duplicate Query. Refer to Sr.No. #205
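One common way to reconcile a legacy table against its migrated copy at a fixed point in time is to compare row counts and an order-independent checksum. The sketch below is illustrative only and is not the Bank's or any vendor's defined mechanism: the function names are hypothetical, rows are assumed to be available as tuples extracted from both systems at the same cut-off, and XOR-combining hashes is a simplification that can miss some cases involving duplicate rows.

```python
import hashlib
from typing import Iterable, Tuple


def table_fingerprint(rows: Iterable[tuple]) -> Tuple[int, int]:
    """Return (row count, XOR of per-row SHA-256 digests).

    XOR makes the result independent of row order, so the two systems
    do not need to return rows in the same sequence.
    """
    count = 0
    combined = 0
    for row in rows:
        encoded = "|".join(str(value) for value in row).encode("utf-8")
        combined ^= int.from_bytes(hashlib.sha256(encoded).digest(), "big")
        count += 1
    return count, combined


def reconcile(source_rows: Iterable[tuple], target_rows: Iterable[tuple]) -> bool:
    """True when both extracts have the same row count and checksum."""
    return table_fingerprint(source_rows) == table_fingerprint(target_rows)


legacy = [(1, "A", 100.0), (2, "B", 250.5)]
migrated = [(2, "B", 250.5), (1, "A", 100.0)]  # same rows, different order
print(reconcile(legacy, migrated))
```

Because the comparison only holds when both extracts are taken at the same cut-off, the point-in-time question raised in the query determines whether a check of this kind is meaningful during the parallel run.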

780 18 2 Migration from Existing Setup to Proposed Solution

Data migration from Staging and Data Marts, user

tables and any other schemas identified by Bank.

Along with the Staging and Data Mart objects, is there

any integrated data layer in the existing solution? If

yes, then is it built using any proprietary data model?

Duplicate Query. Refer to Sr.No. #206

781 29 7 Data Science Platform with AI/ML Capabilities

In-memory computing & integration with Spark,

Redis, etc

What kind of analytic processing is expected on Spark,

Redis, etc. ? Is this using Spark-ML, for example ?

Duplicate Query. Refer to Sr.No. #207

782 29 18 Data Science Platform with AI/ML Capabilities

All machine-learning platforms either support

multiple models out of the box or provide an

option to custom-code the same

What kind of use cases for ML are expected so as to

understand need for existing out of the box vs. custom

solutions ?

Duplicate Query. Refer to Sr.No. #208

783 29 19 Data Science Platform with AI/ML Capabilities

Integration with R, Python, Keras,

Tensorflow,Theano, scikit-learn etc and other

frameworks / languages

Is there a need to connect with any other analytic

engines to be run in a parallelized/distributed

manner on the system ?

Duplicate Query. Refer to Sr.No. #209

784 29 24 Data Science Platform with AI/ML Capabilities

Annexure E gives sample use cases which are to be

implemented on Next Gen Data Warehouse using

structured and/or unstructured and/or semi-

structured and/or any other kind of data gathered

from either Data Warehouse or Data Lake or Data

Virtualization or all together or any other source.

Is there a need to read data directly from a low cost

storage system and do complex analysis/queries

involving multi-table joins with curated relational data,

using SQL/analytic functions which requires

performance and scalability ?

Duplicate Query. Refer to Sr.No. #210

785 35 33 Hardware Specifications

Next-Gen DW should support at-least 500

concurrent users, scalable up to 1000 users in next

5 years, running ETL/ELT jobs or doing ad-hoc data

extraction requests on database (Not including API

based access or scheduled job connections to

database)

What is the nature and expected concurrency of API

based access ? What is nature and concurrency of

scheduled job connections - are these ETL, or

maintenance related connections ?

Will the 1000 concurrent users be expected to be

running queries simultaneously, or is this just 1000

concurrent logons ?

Duplicate Query. Refer to Sr.No. #211

786 36 37 Hardware Specifications

Ad-hoc jobs of any complexity should not hamper

the scheduled jobs performance.

What is expected mix of queries in terms of tactical

(very short), medium, long running (reports), batch

loads, near real-time and real-time loads running

simultaneously on the system ?

Duplicate Query. Refer to Sr.No. #212

787 18

Migration from Existing Setup to

Proposed Solution

Vendor should propose a detailed seamless

automated migration plan from existing setup to

proposed solution. Plan should focus on less

manual intervention, data reconciliation between

the systems and minimum parallel run of existing

and proposed solution.

1. How frequent are changes to the code ?

2. What is the defined process to capture the change

requests?

3. What version control tool is currently being used ?

4. What is their current release management process?

5. What is definition of Minimum Parallel Run ?

6. Can the work be done from Teradata offshore

locations or has to be done from onsite?

7. What is the current Testing Strategy ? Is there any

document?

8. Are there any performance-related expectations?

9. What are the inventory numbers of the objects ( if

possible by complexity ) of the existing DWH which

needs to be migrated?

10. Can the legacy code be shared for pattern analysis

? ( If not complete code base, then can a sample be

shared ?)

11. What kind of Test Automation Tools are used

currently?

12. What is the typical availability of customer SMEs

for UAT ?

13. Does the Customer have the business

reconciliation queries? In other words, How do they

verify the data loaded in the current environment?

14. Does the customer have an existing Test/QA

environment ?

15. What kind of documentation is required as a part

of deliverables?

Duplicate Query. Refer to Sr.No. #213

788 18

Migration from Existing Setup to

Proposed Solution

Data migration from Staging and Data Marts, user

tables and any other schemas identified by Bank.

1. How many more data marts are there other than 4?

2. Are these data marts built on different databases?

3. Does their DWH comprise of multiple data marts

only or they have an Integrated EDW and model

already in place?

4. Is there any Subject Area Priority (logical split of the

Next Gen DW) & anticipated sizing?

5. What are the SLA times of current ETL jobs?

6. Are the large tables horizontally partitioned?

Duplicate Query. Refer to Sr.No. #214

789 18

Migration from Existing Setup to

Proposed Solution

Data migration from existing archival solution to

new one.

1. What is the existing Archival Solution ? Duplicate Query. Refer to Sr.No. #215

790 18

Migration from Existing Setup to

Proposed Solution

Migration of existing data sourcing ETL jobs. 1. Do they maintain/update Data Mapping Sheet for

ETL /Data Ingestions ?

2. Is the data currently loaded in Batch or Mini-Batch ?

3. Do they have any Design and Coding Standards ?

4. Do they have document on the current ETL

Architecture/Solution, code patterns and their

complexity ?

5. Which Scheduler is being used ?

6. What is the current data volume and how much

data is ingested through different ingestion

mechanisms (batch, real-time etc) ?

7. Do they want to change any current tools ( ETL )

they have ? If yes, then what would be the tool stack?

8. Do they currently have an ETL Control Framework

implemented?

Duplicate Query. Refer to Sr.No. #216

791 19

Migration from Existing Setup to

Proposed Solution

Migration of monitoring dashboard data points 1. Is it to show the progress of migration underway?

2. Is there a need to build a migration framework for

future data migration ?

Duplicate Query. Refer to Sr.No. #217

792 19

Migration from Existing Setup to

Proposed Solution

Migration of Data Governance, Data Lineage and

Data Quality rules and policies

1. What are the existing Data Governance, Data

Lineage and Data Quality rules and policies ?

Duplicate Query. Refer to Sr.No. #218

793 19

Migration from Existing Setup to

Proposed Solution

Migration of All the remaining components of

existing ecosystem (Mentioned in Annexure - D) as

and when identified by Bank like job scheduler,

reports, history of version control, existing tape

backup, etc.

1. Will the access be provided to all systems - what

would be the constraints ?

Duplicate Query. Refer to Sr.No. #219

794 19

Migration from Existing Setup to

Proposed Solution

Vendor should list out all types of risks they expect

during the migration. Vendor should provide

justification if any downtime is required on existing

or proposed system during migration. Vendor

should provide all the pre-requisites for the

migration in the proposal.

1. What source systems (ERP, CRM etc.) does

the current DWH have ?

2. What is the current downtime schedule for their

existing DWH ?

Duplicate Query. Refer to Sr.No. #220

795 19

Migration from Existing Setup to

Proposed Solution

Vendor to review the existing architecture during

migration and remove duplication of data and

recommend improvements in overall setup if any

1. What are the deduplication rules ? Do they

currently have any defined?

Duplicate Query. Refer to Sr.No. #221

796 19

Migration from Existing Setup to

Proposed Solution

Vendor should provide a feasible plan for best use

of existing infrastructure which is procured during

last 10 years in staggered manner during the

implementation of Next-Gen DW which will save

cost to the Bank. (Annexure D gives the technology

architecture of the current setup)

1. Need more detailed information for their existing

Infrastructure and eco-system.

Duplicate Query. Refer to Sr.No. #222

797 13 Scope of Work Team structure (without actual profiles) Does the bank anticipate OEM involvement in

implementing core OEM related services via the

Systems Integrator

Duplicate Query. Refer to Sr.No. #223

798 13 Critical Functional Requirements -

Data Ingestion

Data may be structured, semi-structured, and

unstructured. It may come from internal or external

sources. It may come in batches, incremental

additions or real-time feeds. There should be no

limitation on the type, format and size of data

ingested. Data may include log, feeds, audio, video,

image, NOSQL, RDBMS, unstructured text, through

ERP systems, etc

Please share the split ratio of Structured : Semi-Structured :

Unstructured (Images) : Unstructured (Videos) data.

This helps in solutioning taking into consideration

ground realities

Duplicate Query. Refer to Sr.No. #224

799 25 Critical Functional Requirements -

Elements based Reporting

Vendor should follow the RBI guideline in

developing the solution with which it will be easier

for the Bank to migrate to the element-based data

reporting envisaged by the RBI.

Please elaborate with an example or 2 explaining

elements based Report to have a common

understanding

Duplicate Query. Refer to Sr.No. #225

800 42 Annexure E- Functional Use Cases

(Risk Area)

General Would the bank expect Graph Analytics capabilities in

the solution for better network analytics which helps

determine the betweenness and strength of network

Duplicate Query. Refer to Sr.No. #226

801 25, 39 Critical Functional Requirements -

Regulatory Reporting

Automation –Tool should automate analytics and

reporting workflow end-to-end, including all data

collection, enrichment, and management, as well as

all calculations, processes to final report

submission. Currently 500+ jobs are being used for

Tranche 1 DCT generation along with 500 more for

other regulatory reports/returns.

Please provide the number of returns/reports for

regulatory body over and above the ones listed in

Annexure E (Sno 4)

Duplicate Query. Refer to Sr.No. #227

802 General General General By when is the RFP expected and by when is the bank

expecting to conclude this.

This information is not required at this stage of EOI.

803 General General General The reason for this question is, if the contract of

existing Datawarehouse ecosystem with existing

vendor is nearing completion, then this will have a

direct bearing on migration strategy as well as

extension of licencing till such time the migration from

existing to new system takes place (may take few

months)

Duplicate Query. Refer to Sr.No. #228

804 General General General We understand that SBI has an MDM solution, hence

the data quality issues, duplication must be under

control. What DQ tools and quality is expected for the

EDW

Duplicate Query. Refer to Sr.No. #229

805 39 Annexure D Existing Data Warehouse Architecture What is the current number of users at the

BI/ETL/Database level? What is the user concurrency at

BI/ETL/Database level? Also, what is the type/make/core

information of ETL/DWH/ODS/DM Servers.

Duplicate Query. Refer to Sr.No. #280

806 39 Annexure D Existing Data Warehouse Architecture Existing Backup Details are not provided , please

provide backup and other configuration details .

Existing backup are disk based /tape based/frequency

of backups

Duplicate Query. Refer to Sr.No. #281

807 19 Disaster Recovery

clause 1

Bank proposes to setup only functional DR to start

with. At later stage Bank may take decision to

setup full scale 100% DR.

Please elaborate on functional DR in terms of PROD

capacity . Is DR being looked from Day 1?

Duplicate Query. Refer to Sr.No. #282

808 19 DR Clause 4 The DR solution should be synced with production

NEXT-GEN DW. The SLA for RTO should be

maximum 2Hrs as per Bank’s defined policy.

Since the volumes involved are large , the bandwidth

capacity and the time slots for replication will be

provided by the Bank. Please clarify

Duplicate Query. Refer to Sr.No. #283

809 19 DR Clause 6 The proposed solution is expected to have a

monitoring engine that can determine the health of

production NEXT-GEN DW and raise alerts / trigger

remedial actions to bring NEXT-GEN DW – DR as

the default NEXT-GEN DW

Bank wishes to have an automation tool for the same .

Please clarify.

Duplicate Query. Refer to Sr.No. #284

810 32 HW Specs clause 10 Vendor must provide detailed configuration of the

proposed Hardware, including Hosting Space

Requirements, Racks, Power, Cooling and any other

requirement for the fulfillment of the Vendor’s

obligation in this EOI.

For exact sizing various inputs/interaction will be

required . For EOI an approx. indication should suffice

. Please confirm

Duplicate Query. Refer to Sr.No. #285

811 35 HW Specs Clause 30 Vendor is required to provide the minimum

resources to monitor & manage the infrastructure,

however it is the Vendor’s responsibility to right

size the resources to meet the SLA

Clarification needed on Bank's expectations on

number of resources. Can you please explicitly

mention the SLA requirements.

Duplicate Query. Refer to Sr.No. #286

812 36 HW Specs Clause 44 Vendor need to propose a solution for data

migration / transfer between Existing DWH (Navi

Mumbai Location 1) and NEXT-GEN DW-PR (Navi

Mumbai Location) and also between NEXT-GEN DW-

PR (Navi Mumbai Location 2) and Hyderabad (DR)

or any other places for PR and DR decided by the

Bank.

Please share details of locations , approx distances ,

bandwidth capacity to be provided by Bank. Please

clarify.

Duplicate Query. Refer to Sr.No. #287

813 36 HW Specs Clause 48 The vendor should provide EXACT size needed for

production in the 1st year and estimated sizes for

consecutive years keeping in view the growth rate

predicted by Bank in this section and provide

empirical evidence for the calculation of growth

rate.

For exact sizing various inputs/interaction will be

required . For EOI an approx. indication should suffice

. Please confirm

Duplicate Query. Refer to Sr.No. #288

814 39 Annexure D Existing Data Warehouse Architecture Please share : 1. Detailed architecture diagram of

existing EDW & 2. Breakup of tiers among the

80+ production servers.

Duplicate Query. Refer to Sr.No. #289

815 46 Annexure G Next Gen Data Warehouse Sizing DR to be sized only for DWH . Please confirm Duplicate Query. Refer to Sr.No. #290

816 40 Annexure E Functional Use Cases Is Bank looking to leverage existing OFSAA

implementation for these use cases? Alternatively, will

OFSAA be fed from the DWH? Please let us know the

interplay between DWH and OFSAA.

Duplicate Query. Refer to Sr.No. #291

817 40 Annexure E Functional Use Cases - does the bank see the Next-Gen-DW as a

complement to existing data solutions at the

bank (e.g. OFSAA) ?

Duplicate Query. Refer to Sr.No. #292

818 40 Annexure E Functional Use Cases - role of Next-Gen-DW in mission critical

operational processes e.g. daily RBI and regulator

reporting, real-time financial crime detection,

etc.

Duplicate Query. Refer to Sr.No. #293

819 12 Annexure B End State Objective : Migration from existing setup Please provide complete detail of existing DWH

Solution. Information including but not limited to:

DWH platform, model, version and Size

Details of CPUs, Memory, OS, Database version

Details of Storage configuration: Size, capacity, free

and used space etc

Whether there are any physical or logical isolation of

DWH setup

HA, Backup, DR details

Duplicate Query. Refer to Sr.No. #701

820 13 Annexure B Performance Benchmark of Next Gen DWH Oracle provides Industry standard globally acceptable

metrics to measure Database performance like IOPS,

throughput and Load rate. We believe this addresses

the requirement

Duplicate Query. Refer to Sr.No. #295

821 16 4 Storage replication (e.g. RAID) should be

automatically managed by the platform.

We believe this is a RAID level if not then please

elaborate more

Duplicate Query. Refer to Sr.No. #296

822 16 8 Storage should support data compression. It should

be possible to perform both fast compression and

efficient compression based on data processing

needs.

We recommend that the Bank additionally ask for Storage

and DWH system capable of providing columnar as well

as row-based compression. Data should also be

readable without decompression

Duplicate Query. Refer to Sr.No. #297

823 16 9 The storage should be horizontally and vertically

scalable. Redistribution of data across the NEXT-

GEN DW should be possible automatically and

seamlessly.

We recommend that the Bank additionally ask for

"Storage upgrade and capacity increase should be done

without downtime".

Duplicate Query. Refer to Sr.No. #298

824 16 11 The storage system should be robust to handle at

least 1,50,000 concurrent queries (Select/DML) by

processing engines / ETL jobs / end users scalable

up to 6,00,000 concurrent queries in next 5 years

(assuming parallelism of 100 degree).

1. Please provide details of queries and ETL jobs

2. Please provide ratio of Select/DML queries and ETL

jobs

3. Share the complexity of query- simple, medium,

complex Queries

4. Also provide Query definition of Simple, Medium

and Complex queries

Duplicate Query. Refer to Sr.No. #299

825 16 12 Downstream departments (data consumer) to be

given separate processing power, storage to

undertake their requirements with separate DB

snapshot, Audit trails should be available for any

user accessing the Databases. Construction of this

separate Database snapshot and enabling this audit

trails must not cause any major systemic

issues/challenges in smooth functioning of primary

DB.

Please provide rationale and logic to have separate

processing and storage for this. Powerful DWH

systems are today capable of servicing all consumer

groups in parallel

Duplicate Query. Refer to Sr.No. #300

826 17 12 ETL/ELT tool for data extraction should have AI/ML

features for suggesting / improving Query / ETL /

ELT Stages

Please clarify further with example Duplicate Query. Refer to Sr.No. #301

827 17 13 Existing reports and extracts generation jobs on

DWH should be analyzed and transformed to the

NEXT-GEN DW. The vendor should use preferably

off-the- shelf tools and not resort to building from

scratch.

Please share sample reports and extracts to propose

most suitable options for migration

Duplicate Query. Refer to Sr.No. #302

828 17 Data transformations should be triggered in

parallel. The NEXT-GEN DW should be capable to

run multiple transformation jobs in parallel. The

NEXT-GEN DW should be able to run at-least 1500

jobs in parallel, scalable up to 5000 in next 5 years,

of varying complexity - simple, medium, complex, in

batch or near real

Please define complexity - simple, medium, complex.

Share the split % of simple, medium, complex.

Please share sample jobs for each type.

Duplicate Query. Refer to Sr.No. #303

829 27 User Management: Pt4-The access privileges

associated with each system product, e.g.

operating system, network, database, application

and system utilities, and the users to which these

privileges need to be allocated should be clearly

identified and documented.

Should we assume that the access privileges are to be

assigned to the users directly and managing access to

these privileged accounts is not required?

Duplicate Query. Refer to Sr.No. #304

830 39 User database of 30000+ officials Should we assume approx 30k users would access the

solution with a YoY increase of 10%?

Duplicate Query. Refer to Sr.No. #305

831 30 6 Reporting on all types of available Data Formats;

· Structured, semi-structured, unstructured

· Click stream data

· Audit Logs

· Documents

· Multimedia data (Images/Videos/Audios)

· XBRL format

· IRIS iFILE framework

Please clarify the type of database used for

each of the data formats asked for. Or is it safe to assume

that the underlying data will be in an industry-standard

relational data format such as Oracle, DB2, etc.?

Duplicate Query. Refer to Sr.No. #306

832 30 5 Visualizations: BI tools must provide below

different types of visualizations;

· Animations, Barcodes

· Bar, line, pie, area and radar chart types

· Tables, Graphs, Infographics, Filters

· Widgets

· Drag and Drop Creation, Customization

· Templates

· Freehand SQL Command

· Geospatial Integration

· Layouts

· Themes

· Ability to mix and match various combinations

Please elaborate on the definition of

-Animations

-Infographic :what all visualization are you referring to

-Widgets, Templates

Duplicate Query. Refer to Sr.No. #307

833 32 22 In-memory analytics: The product should pull data

into an in-memory or locally cached data store

preferably columnar is an increasingly popular

feature that enables very fast analytics once the

data is loaded.

To achieve the fast analytics, BI tool may adopt

different architecture. BI tool can easily leverage the

in-memory benefits of the underlying database without

pulling data and creating redundancy, which would

reduce data manageability at the BI layer. We request

you to rephrase this point as

"In-memory analytics: The product should pull data

into an in-memory or leverage the In-Memory

capabilities of the underlying database or a locally cached

data store preferably columnar is an increasingly

popular feature that enables very fast analytics once

the data is loaded."

Duplicate Query. Refer to Sr.No. #308

834 32 23 Offline updates: BI tools, when storing copies of

the source data in an online analytical processing

(OLAP) cube or in-memory columnar data store,

should enable business users to schedule

automatic data updates.

Different BI tools have different architecture. BI tool

can easily leverage the capabilities of the underlying

database. Is this point relevant to BI tools whose

architecture stores the data within the BI server?

Duplicate Query. Refer to Sr.No. #309

835 32 28 Speed of access: Query performance will vary based

on the complexity of the queries and the amount of

data involved. Dashboards with multiple

visualizations will need to get query results from

many queries. The best practice is to create several

prebuilt query scenarios and compare how each

product performs based on these specific

examples. The worse practice is to just arbitrarily

rate the speed.

These seem to be best practices for implementation.

Please elaborate what is the requirement from BI

Tool

Duplicate Query. Refer to Sr.No. #310

836 32 29 The best practice is to establish a testing

environment to determine scalability in terms of

both the number of concurrent users and data

metrics, such as volumes, variety and veracity.

These seem to be best practices for implementation.

Please elaborate what is the requirement from BI

Tool

Duplicate Query. Refer to Sr.No. #311

837 32 32 Ability to handle and summarize huge volumes of

data. E.g. 30-40 million rows accessed on index and

summarized over 5 to 8 metrics.

Please elaborate the use case for consumption of 30-

40 million rows from BI. Usually BI tool leverages the

underlying database to do summarization of data and

only works on the resulting dataset

Duplicate Query. Refer to Sr.No. #312

838 35 36 The web portal of Business Intelligence tool should

support at-least 25000

concurrent users, scalable up to 75000 in next 5

years, accessing various reports generated

For doing the sizing of Business Intelligence we need

bifurcation of the concurrent users (25000)

Total Concurrent Users : 25000

Number of concurrent active : provide the concurrent

active user count

Number of logged-in/inactive : provide the

logged-in/inactive user count

Out of Active Users:

- Users executing BIEE dashboards (having 4/5 reports

or simple charts in a dashboard)

- Users executing large Pivot table operations (25000+

rows)

- Users executing (small to medium sized report - 50K

cells or lower) export to pdf/XL operations

- Users executing very heavy Graphics

Number of Active Concurrent running Extra Large

Reports

(Usually Extra Large Reports are executed off-line

hours)

Duplicate Query. Refer to Sr.No. #313

839 15 point # 16 Proposed solution should be able to scrape

encrypted log, capture Metadata

changes at source level completely, scraping 4000-

5000 logs daily having log

size of ~ 2 TB each scalable up to 10000 logs.

Proposed solution should be

capable of scraping logs generated by any type of

Database. E.g. Oracle

Database, IBM DB2 Database etc.

Kindly specify if decryption of logs is also required or

only storage of such logs is fine ? If yes, is there

decryption logic available in specified system ?

Duplicate Query. Refer to Sr.No. #314

840 38 Annexure C -Monthly Data processed

in DWH Warehouse

Archived log extract CBS (SBI) +

TF (SBI)

Are these logs encrypted?

Do we need to keep the RAW logs into the system? Or

only processed logs ?

Duplicate Query. Refer to Sr.No. #315

841 29 # 9 (Data Science Platform with

AI/ML Capabilities)

GPUs to be incorporated in solution if possible

using HDFS Hadoop like

environment for better analytical results

1. Is there requirement to run AI/ML models within

HDFS Hadoop ? Or Expectation is to pull the data into

GPU based analytics workbench and then process.

2. Running AI/ML models within Hadoop is also faster

and Having separate GPU based system for specific AI

models can reduce the cost of GPU based solution.

Please suggest

Duplicate Query. Refer to Sr.No. #316

842 15 # 1 - Data Storage A multi-temperature data management solution to

be proposed by vendor where

data that is frequently accessed is stored on fast

storage (hot data), compared to less frequently

accessed data stored on slightly slower

storage (warm data), and

rarely accessed data stored on the slowest storage

(cold data). System should

also be capable of automated storage tiering and

seamless data transfer between

hot, warm and cold storage. Data residing in any of

these storage areas must be

seamlessly mixed / merged according to

requirements without impacting

performance.

Kindly share the tentative timeline for Hot/Warm/Cold

data so that we could calculate the size. Example: Hot

Data - 6 months, Warm Data - 1 year, Cold data > 1

year etc.

Duplicate Query. Refer to Sr.No. #317
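
For illustration only, the age-based tiering rule suggested in the query above could be sketched as follows. The thresholds come from the vendor's example (hot up to 6 months, warm up to 1 year, cold beyond 1 year) and are not confirmed by the Bank. The function name `storage_tier` and the use of last-access date as the tiering criterion are assumptions made for this sketch.

```python
from datetime import date, timedelta

# Hypothetical thresholds, taken from the example in the vendor's query;
# the Bank has not specified actual hot/warm/cold durations.
HOT_MAX_AGE = timedelta(days=183)   # roughly 6 months
WARM_MAX_AGE = timedelta(days=365)  # 1 year


def storage_tier(last_accessed: date, today: date) -> str:
    """Classify a data set as hot, warm or cold by time since last access."""
    age = today - last_accessed
    if age <= HOT_MAX_AGE:
        return "hot"
    if age <= WARM_MAX_AGE:
        return "warm"
    return "cold"


if __name__ == "__main__":
    today = date(2020, 3, 1)
    for accessed in (date(2020, 1, 1), date(2019, 6, 1), date(2018, 1, 1)):
        print(accessed, storage_tier(accessed, today))
```

Sizing each tier then reduces to summing the volume of data that falls in each class, which is why the retention durations are needed before storage can be estimated.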

843 18 18 Transformations for this activity can be categorized

into the following types:

· Existing transformations in DWH that needs to be

migrated to NEXT-GEN DW

Please share the existing transformation details. Duplicate Query. Refer to Sr.No. #318

844 20 1 Data older than specific duration as identified by

Bank to be archived in low cost cold storage.

Changing data archival rules should be easily

configurable. Vendor to propose solution for the

same with cheap and flexible storage and

processing

Please provide Retention period Duplicate Query. Refer to Sr.No. #319

845 20 5 Store backup of entire ecosystem on suitable cost-

effective, fast recovery infrastructure (Currently

tape backup is taken)

Please provide the details of existing Tape Backup

Solution, Backup Window, Backup Throughput and

Restoration throughput

Duplicate Query. Refer to Sr.No. #320

846 13 Annexure B - Technical Criteria/Scope

of Work > Critical Functional

Requirements > Data Ingestion Sr #2

Data may be structured, semi-structured, and

unstructured. It may come from internal or external

sources. It may come in batches, incremental

additions or real-time feeds. There should be no

limitation on the type, format and size of data

ingested. Data may include log, feeds, audio, video,

image, NOSQL, RDBMS, unstructured text, through

ERP systems, etc

Which RDBMS source data is required to be extracted

in real-time mode? Please provide the source system

name, RDBMS type (Oracle/ SQLServer etc) and the

underlying OS

Duplicate Query. Refer to Sr.No. #321

847 19 Annexure B - Technical Criteria/Scope

of Work > Critical Functional

Requirements >Migration from

Existing Setup to Proposed Solution

Sr #5

Migration of existing data extraction and reporting

jobs.

Since this is across different products, is this expected

to be semi-automated / manual?

Duplicate Query. Refer to Sr.No. #322

848 19 Annexure B - Technical Criteria/Scope

of Work > Critical Functional

Requirements >Migration from

Existing Setup to Proposed Solution

Sr #6

Migration of monitoring dashboard data points. Since this is across different products, is this expected

to be semi-automated / manual?

Duplicate Query. Refer to Sr.No. #323

849 19 Annexure B - Technical Criteria/Scope

of Work > Critical Functional

Requirements >Migration from

Existing Setup to Proposed Solution

Sr #7

Migration of user details. Since this is across different products, is this expected

to be semi-automated / manual?

Duplicate Query. Refer to Sr.No. #324

850 19 Annexure B - Technical Criteria/Scope

of Work > Critical Functional

Requirements >Migration from

Existing Setup to Proposed Solution

Sr #8

Migration of Data Governance, Data Lineage and

Data Quality rules and policies

Since this is across different products, is this expected

to be semi-automated / manual?

Duplicate Query. Refer to Sr.No. #325

851 38 Annexure C - Monthly Data

processed in DWH Warehouse Sr #4

Contribution from other source systems like DMAT,

CMP, SBI Life, LOS, etc.

Will these also be flat files? If not, what will be the

interface mode (RDBMS, webServices, API etc) and

which RDBMS?

Duplicate Query. Refer to Sr.No. #326

852 38 Annexure C - Monthly Data

processed in DWH Warehouse Sr #4

Contribution from other source systems like DMAT,

CMP, SBI Life, LOS, etc.

Please provide a count of data sources (best

approximation), and of these how many will be flat file

sources?

Duplicate Query. Refer to Sr.No. #327

853 23 Annexure B - Technical Criteria/Scope

of Work > Critical Functional

Requirements > Data Quality Sr #12

Mechanism to capture feedback from end users to

report Data Quality issues

Please elaborate. Can this be implemented using

enterprise collaboration tooling / ticket maintenance

system?

Duplicate Query. Refer to Sr.No. #328

854 39 Annexure D - Existing Data

Warehouse Architecture Sr #14

Data Quality What are the existing Data Quality details? How many

and which entity masters are maintained? What is the

current count of each type of Entity and how are their

counts expected to scale up (volumetrics)?

Duplicate Query. Refer to Sr.No. #735

