Statistica-Release-notes_125.pdf

  • Dell Software August 2014

    What's new in Dell Statistica™ v12

    Overview

    Better. Bigger. Faster.

    An explosive combination of Big Data growth, digital storage capabilities, and technological advances has forever altered the modern business analytics landscape. The application of analytic tools and decision making is no longer limited to the realm of data scientists, computer programmers, engineers, and the like. Rather, analytics are now being integrated into day-to-day tasks across all departments, utilized by project managers, business analysts, predictive modelers, customer agents, and executive leaders who need access to sensible, actionable information. These are people who need visual user interfaces to create, consume, and share KPIs, graphs, reports, slide presentations, and more. To meet these changes head on, we made Statistica even faster, more flexible, and more functional than ever:

    We boosted the Big Data performance of the entire product line.

    We added a visual user interface to write SQL queries with the new Advanced Query Builder in all products.

    We reinvented the visual analytic workspace in Statistica Enterprise and Data Miner for a more intuitive user experience, with greater visual workflow and storage capabilities to help users understand and communicate their findings.

    We strengthened the predictive/prescriptive capabilities of Decisioning Platform.

    We introduced the highly flexible Reporting Tables product that enables users to visually build tables of summary statistics and use them in presentations and other reports.

    We developed new nodes, such as the practical Data Health Check that facilitates cleanup of a large number of variables.

    With the rollout of Statistica 12 in April 2013, we've built on a nearly 30-year legacy of exceeding customer expectations, furnishing this ever-growing business landscape with a host of relevant features and performance improvements that will make our analytic solutions even faster, more accessible, and more effective for business leaders and power users alike. We fit into your IT world better than any alternative. Whether handling medium data or Big Data, Statistica 12 takes greater advantage of existing data warehouses and IT tools than ever before, helping move businesses even closer and faster to meaningful ROI.

    All components

    Advanced Query Builder

    Advanced Query Builder (AQB) makes it possible for even non-technical staff to write complex queries to retrieve data. It has a new visual user interface to build queries (dragging, dropping, nesting, selecting). The application's parsing engine determines the current context.

    Offering features usually found only in specialized applications, AQB can build left, right, and full outer joins graphically; can build queries with aggregate functions; can build complex queries involving union and minus operations; can graphically represent complex SQL queries and ER diagrams; and lets you change the SQL dialect when the universal default is not practical.
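
    As an illustration of the kinds of queries listed above, here is a minimal sketch using Python's built-in sqlite3 module. It does not show AQB or its generated SQL; the tables, columns, and data are hypothetical. SQLite spells the minus operation EXCEPT, which is one example of why a dialect setting matters.

```python
import sqlite3

# Hypothetical demo tables; none of these names come from Statistica.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'East'), (2, 'West'), (3, 'West');
    INSERT INTO orders VALUES (1, 10.0), (1, 15.0), (2, 7.5);
""")

# Left outer join with aggregate functions: customers without orders are kept.
left_join_totals = conn.execute("""
    SELECT c.id, COUNT(o.amount), COALESCE(SUM(o.amount), 0)
    FROM customers AS c
    LEFT OUTER JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

# Minus operation: customer ids that never appear in orders.
customers_without_orders = conn.execute("""
    SELECT id FROM customers
    EXCEPT
    SELECT customer_id FROM orders
""").fetchall()

# Union of two result sets; UNION removes duplicate rows.
all_ids = conn.execute("""
    SELECT id FROM customers
    UNION
    SELECT customer_id FROM orders
    ORDER BY 1
""").fetchall()
```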

    Spreadsheet Improvements

    NEW FILE FORMAT FOR BETTER SUPPORT OF BIG DATA
    Statistica now features a new data file format that is optimized for Big Data by supporting variable storage length for text variables. When text variables include sparsely populated columns, the space occupied by those values is automatically optimized, reducing spreadsheet sizes sufficiently to produce significant performance improvements.

    SPREADSHEET VIRTUAL VARIABLES
    Spreadsheets now use virtual variables that can be specified by formula and evaluated at run time, requiring no real storage. These virtual variables are added or deleted behind the scenes without needing to rewrite entire spreadsheet data sections, so users will notice only enhanced performance. New data is held in a separate vector on disk and is reunited with the original spreadsheet when data is saved. This yields significant performance improvements for large spreadsheets, especially when you need to add transformed variables.

    INCREASE IN TEXT LABELS
    Text Label support in spreadsheets has now been increased to millions of distinct labels with significant performance improvements for name/value lookup. This makes Text Labels a good choice for text fields with large numbers of distinct values, inheriting all the performance benefits from a fixed storage size of the numeric value and avoiding duplication of repeated values.
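
    The idea behind Text Labels (a fixed-size numeric value per case plus a single stored copy of each distinct text) can be sketched as follows. This is an illustrative Python sketch of dictionary encoding in general, not Statistica's file format; the class and attribute names are hypothetical.

```python
from array import array


class TextLabelColumn:
    """Keeps each distinct text once and a fixed-size integer code per case."""

    def __init__(self):
        self.labels = []         # code -> text
        self.code_of = {}        # text -> code, for fast name/value lookup
        self.codes = array("l")  # one fixed-size integer per case

    def append(self, text):
        code = self.code_of.get(text)
        if code is None:
            # First time this text is seen: assign the next free code.
            code = len(self.labels)
            self.labels.append(text)
            self.code_of[text] = code
        self.codes.append(code)

    def __getitem__(self, case):
        return self.labels[self.codes[case]]


column = TextLabelColumn()
for value in ["Oklahoma", "France", "Oklahoma", "Oklahoma"]:
    column.append(value)
```

    Four cases are stored here as four integers and only two strings, which is where the savings come from when values repeat often.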

    AGGREGATE FUNCTION IN OLE DB PROVIDER FOR Statistica SPREADSHEETS
    The OLE DB provider now supports aggregate functions such as average, count, max, min, and sum.

    IMPORTING TEXT FILES USING AUTO-FIXED IMPORTING VARIABLE OPERATIONS
    This enhancement makes it possible to take blocks of data that contain fixed-length pieces of information and specify the fixed lengths used to import variable-specific information.

    Statistica now has the option for a Fixed import setting.
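
    To make the fixed-length import concrete, here is a minimal Python sketch of splitting lines into variables by fixed widths. It does not reproduce Statistica's import dialog; the function name, field names, widths, and sample lines are all hypothetical.

```python
def parse_fixed_width(lines, widths, names):
    """Split each line into named fields of the given fixed widths."""
    records = []
    for line in lines:
        record, start = {}, 0
        for name, width in zip(names, widths):
            # Take the next `width` characters and drop the padding.
            record[name] = line[start:start + width].strip()
            start += width
        records.append(record)
    return records


# Hypothetical layout: 4-character id, 10-character name, 4-character year.
raw = [
    "0001Smith     2014",
    "0002Nakamura  2013",
]
rows = parse_fixed_width(raw, widths=[4, 10, 4], names=["id", "name", "year"])
```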

    Data Visualization

    Several new options have been added to provide additional features and tools for visualizing data.

    "Orthogonal regression" fit type is now supported in 2D scatterplots

    Points on graphs can now be annotated

    New options in compound graphs improve visual appearance by controlling the scaling display

    A new data file can be created by brushing the points to be included

    Date and time support was added for meaningful time intervals in graph scales

    Now you can modify the margins of all plots in an original graph (e.g., multi-graph layout)

    Create Pareto charts more easily

    We added a new graph type, the parallel coordinate plot, which shows multiple variables, side-by-side, on comparable scales, thus making it easier to compare values across variables (see below)

    Each Y-axis corresponds to a variable in a Statistica spreadsheet and can be defined according to standalone values or two-sided values (e.g., range boundaries, upper and lower limits, etc.)
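
    For readers unfamiliar with the orthogonal regression fit type mentioned in the list above: it fits a line by minimizing perpendicular distances to the points rather than vertical ones. The sketch below uses the standard closed-form slope for that fit; it is a generic Python illustration, not Statistica's implementation, and the function name is hypothetical.

```python
import math


def orthogonal_regression(xs, ys):
    """Fit y = intercept + slope * x by minimizing perpendicular distances."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxy == 0:
        raise ValueError("slope is undefined when x and y are uncorrelated")
    # Closed-form slope of the total least squares line.
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Points that lie exactly on y = 2x + 1.
slope, intercept = orthogonal_regression([0, 1, 2, 3], [1, 3, 5, 7])
```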

    Statistics

    FALSE DISCOVERY RATE
    False Discovery Rate (FDR) and Qvalues were added. FDR performs the Benjamini and Hochberg method, and Qvalues performs the method described in the 2002 Storey paper.

    NEW DISTRIBUTIONS
    New distributions were added to the Probability Distribution Calculator, STATISTICA Visual Basic functions, and spreadsheet functions. These are for hypergeometric distributions (inverse, cumulative, prob) and the inverse Poisson and inverse binomial distributions.

    STEPWISE MODEL BUILDER (Statistica ADVANCED)
    Stepwise Model Builder provides control over model building and gives the modeler a what-if environment. This is useful when regulation or a company's standard practices limit which variables can be used to build models. For example, a bank cannot discriminate based on age or gender.

    NEGATIVE BINOMIAL DISTRIBUTION (Statistica ADVANCED)
    This new option is available within GLZ. It enables you to specify the Negative Binomial as the distribution for the response variable. This specific form is referred to as the Poisson-Gamma mixture form and is the discrete analog to the continuous gamma distribution.

    QUALITY CONTROL CHARTS (Statistica QUALITY CONTROL)
    Quality Control now includes options that can set the background color for in control, out of control, and out of warning lines on quality control graphs.
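
    The Benjamini and Hochberg method named under FALSE DISCOVERY RATE above can be sketched in a few lines. This is a generic Python illustration of the step-up adjustment, not Statistica's code; the function name, the sample p-values, and the 0.05 threshold are assumptions.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


p = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(p)
# Tests whose adjusted p-value is at most 0.05 are declared discoveries.
rejected = [a <= 0.05 for a in adjusted]
```

    With these inputs only the first test survives at the 5% level, even though three raw p-values are below 0.05.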

    Other

    MICROSOFT OFFICE 2010 STYLE TOOLBARS
    Statistica now uses the Office 2010 style toolbars. The Help menu has been moved to the File tab.

    SEARCH FACILITY
    Now you can search for modules by name, select a module, and start it. This feature indexes all available ribbon bar options and displays them alphabetically. Typing in the search box restricts the list to those entries that match any of the words from the ribbon bar option. Pressing ENTER opens the selected module's dialog box.

    HIGH RESOLUTION DPI 120 SUPPORTED
    Starting with the release of Windows Vista and the greater availability of very high resolution monitors, Microsoft made it much easier to change DPI. And for Windows 7, themes come with a default of DPI 120 for high resolution. This resolution is now supported by Statistica.

    Data Miner

    Data Miner Workspace Enhancements

    The Workspace has been upgraded to include a large number of new features to improve usability and performance, especially with respect to handling very large data sets.

    A new system of nodes has been introduced with enhancements of the user interface to closely resemble the user interface in the respective modules. The previous nodes are still offered and supported for backwards compatibility.

    ENHANCED ABILITY TO IMPORT EXCEL FILES
    Statistica now has the ability to import Excel files using the nomenclature of Excel spreadsheets: letters for columns and numbers for cases.

    This functionality is not only available interactively, but is also translated to the Workspace utilizing the new Import Excel node.

    You can use this node to import Excel data directly from a spreadsheet into a Workspace.

    Analytic Enhancements

    DATA HEALTH CHECK
    The Data Health Check node is new in Statistica 12 and is available to all Statistica Data Miner users. This node detects common data issues for each variable, completes basic data cleaning, and generates a report that can be used in deciding how to further clean the data. The Data Health Check node is especially useful for exploring a large number of variables automatically.

    CONSTRUCTION OF TREES, SENSITIVITY ANALYSIS
    This new sensitivity option enables you to learn more detail about a specific node. You can then use this knowledge to redefine the splits of the proposed tree in an expert way.

    ORDERED TWOING CRITERION
    This is an option to treat categorical dependent variables in order. It is useful when categories represent levels (low, medium, high).

    PREDICTOR SCREENING
    This is a new method for analyzing predictors that was added to Feature Selection. This functionality can be used as a quick, first look at a predictor to provide a basic set of statistics.
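
    The release notes do not list which issues the Data Health Check node detects. As a rough illustration of what per-variable checks of this kind can look like, the Python sketch below flags variables that are mostly missing or constant. The checks, the threshold, and all names are assumptions, not the node's actual rules.

```python
def health_check(columns, max_missing_rate=0.5):
    """Flag variables that are mostly missing or constant.

    `columns` maps a variable name to a list of values; None marks a
    missing value. The 0.5 threshold is an illustrative assumption.
    """
    report = {}
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        missing_rate = 1 - len(present) / len(values) if values else 1.0
        issues = []
        if missing_rate > max_missing_rate:
            issues.append("sparse")
        if len(set(present)) <= 1:
            issues.append("constant")
        report[name] = {"missing_rate": missing_rate, "issues": issues}
    return report


data = {
    "age": [34, 51, None, 47],
    "batch": ["A", "A", "A", "A"],
    "notes": [None, None, None, "ok"],
}
report = health_check(data)
# Variables with no flagged issue are kept for further analysis.
keep = [name for name, info in report.items() if not info["issues"]]
```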

    Data Access Enhancements

    TERADATA CODE DEPLOYMENT (Statistica DATA MINER WITH CODE GENERATOR)
    User-defined functions can now be defined for the Teradata database, which allows for in-database scoring.

    Enterprise

    Enterprise Workspace Enhancements

    The Workspace has been upgraded to include a large number of new features to improve usability and performance, especially with respect to handling very large data sets.

    A new system of nodes has been introduced with enhancements of the user interface to closely resemble the user interface in the respective modules. The previous nodes are still offered and supported for backwards compatibility.

    ENHANCED ABILITY TO IMPORT EXCEL FILES
    Statistica now has the ability to import Excel files using the nomenclature of Excel spreadsheets: letters for columns and numbers for cases.

    This functionality is not only available interactively, but is also translated to the Workspace utilizing the new Import Excel node.

    You can use this node to import Excel data directly from a spreadsheet into a Workspace.
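
    The Excel nomenclature referred to above (letters for columns, numbers for cases) maps to numeric positions in a simple way. The Python sketch below shows that mapping; it is a generic illustration, not part of the Import Excel node, and the function names are hypothetical.

```python
import re


def column_index(letters):
    """Convert Excel column letters to a 1-based index (A -> 1, AA -> 27)."""
    index = 0
    for ch in letters.upper():
        # Column letters behave like a base-26 number without a zero digit.
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def parse_cell(reference):
    """Split a reference such as 'B12' into (column index, row number)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", reference)
    if match is None:
        raise ValueError(f"not a cell reference: {reference!r}")
    return column_index(match.group(1)), int(match.group(2))
```

    For example, `parse_cell("B12")` gives column 2, case 12, and a range such as A1:C100 would correspond to variables 1 through 3 and cases 1 through 100.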

    Analytic Enhancements

    DATA HEALTH CHECK
    The Data Health Check node is new in Statistica 12 and is available to all Statistica Enterprise users. This node detects common data issues for each variable, completes basic data cleaning, and generates a report that can be used in deciding how to further clean the data. The Data Health Check node is especially useful for exploring a large number of variables automatically.

    REPORTING
    A new enhancement is the selection of spreadsheet cells into dynamic tags, which allows inserting the value of a particular cell into the text of a report and can be used for both text (including paragraph text strings) and numeric values. Individual workbook items can be specified as dynamic tags, making it possible for these items to be included in reports. Additionally, Statistica now supports an expanded list of keyword tags, including workflow name, SDMS version numbers, and more.

    QUALITY CONTROL CHARTS
    Statistica Enterprise now supports full color and pattern control for the elements of QC charts, in the same manner that these options are supported in the interactive usage of Statistica. These controls are accessible from inside the Enterprise Manager application.

    Data Access Enhancements

    SVB DATA CONFIGURATIONS
    With SVB Data Configurations, you can access non-traditional databases that don't have an ODBC or OLE DB provider. As an example, a large text file can be thought of as a database if someone wants to obtain its data. As a text file, however, it does not have an ODBC or OLE DB provider. But with an SVB Data Configuration, it is possible to access this text file as a database and make its data available to Statistica. If you want to execute different queries based on predetermined conditions, those conditions can also be coded into the SVB Data Configuration.

    GENERAL DOCUMENT STORE
    Files can now be saved/opened within the Enterprise System View, so Statistica documents and other document types can be stored within Enterprise Manager and shared among users outside a file share. The Enterprise System View is the default destination for saving reports. Additionally, standard Statistica Enterprise permissions and SDMS versioning are supported. SVB and SVX code can be stored within Enterprise using the General Document store. Now all the places in Enterprise that use SVB can reference the stored code; changing the code in one place can simultaneously implement that change in SVB Analysis Configurations, SVB Data Configurations, Workspace node code, and Secondary SVB Programs within Enterprise.

    BROWSER SUPPORT (Statistica ENTERPRISE SERVER)
    Support is provided for all mainstream browsers: Internet Explorer, Chrome, Firefox, Safari, and Opera. This makes it possible for you to use Statistica Enterprise Server from your iPad or laptop.

    WORKBOOK SUPPORTED (Statistica ENTERPRISE SERVER)
    Workbooks can now be shared easily with others through the Statistica Enterprise Server Portal. After a file is published, a Download from Server link (URL) will be provided.

    Versioning Support (Statistica Enterprise Compliance Edition)

    Statistica Enterprise Compliance Edition is an integration of Statistica Enterprise with a highly scalable document management system that enables you to securely manage documents of any kind, and it is designed to ensure compliance with FDA 21 CFR Part 11 regulations, Sarbanes-Oxley legislation, as well as ISO 9000, 9001, and 14001 documentation requirements. New functionality provides for easy version comparison and opening of previous versions of documents.

    VERSION COMPARISON
    Now when SDMS integration is enabled, you can compare different versions of SDMS objects in Enterprise Manager. Each versionable Enterprise object will have a text representation:

    Data Configuration: list of query, data types, and OLE DB column properties

    IQC Analysis Configuration: summary of QC settings/parameters

    SVB Analysis Configuration: SVB text and properties

    Rules object: text representation of rules

    PMML object: PMML representation of model

    Workflow: text detailing all contained nodes and parameters

    OPEN PREVIOUS VERSION
    For those versionable objects that can be opened directly in Enterprise, including Workspaces, PMML, and Rules objects, Statistica will allow a specified previous version of the object to be opened as a read-only object.

    Labels (Statistica Web Data Entry)

    Labels are used with the Data Entry product. Labels can now be stored in one or more system folders. Customers will find it easier to manage Labels with this new option.
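
    Because each versionable object has a text representation, comparing two versions amounts to comparing two pieces of text. The Python sketch below illustrates that idea with the standard difflib module; it is not how Enterprise Manager performs the comparison, and the sample text representations are invented for the example.

```python
import difflib


def compare_versions(old_text, new_text, old_label="version 1", new_label="version 2"):
    """Return unified-diff lines between two text representations."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",
    ))


# Hypothetical text representations of two versions of a data configuration.
v1 = "query: SELECT * FROM lots\ncolumn: weight (double)"
v2 = "query: SELECT * FROM lots WHERE site = 'A'\ncolumn: weight (double)"
diff = compare_versions(v1, v2)
```

    Lines starting with `-` appear only in the older version and lines starting with `+` only in the newer one.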

    Scorecard

    CALIBRATION TESTS
    Calibration Tests is a tool that makes it possible to compare the forecast probability of default (PD) with the realized PD that eventually occurs. A typical use case in financial institutions is to divide customers into segments of like customers, recognizing that each segment will have a certain number of customers who meet their credit obligations and a certain number who will not. Based upon the model the financial institution has agreed upon, each segment has a forecast PD. After the model has been used for a period of time, the accuracy of the model must be tested. Performing such tests is very easy in Statistica, which even includes a built-in "traffic light approach" described in a popular reference on guidelines in credit risk management (Oesterreichische Nationalbank, 2004).

    RULES
    Statistica Scorecard is now integrated with Statistica Decisioning Platform. This tool can now generate rules for batch scoring or live scoring.
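
    One common way to compare a forecast PD with realized defaults in a segment is a one-sided binomial test. The Python sketch below uses that test with illustrative cut-offs to produce a traffic-light style result. It is not the procedure from the cited reference or Statistica's built-in test; the function names and the 5% and 1% thresholds are assumptions.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def traffic_light(n_customers, n_defaults, forecast_pd, yellow=0.05, red=0.01):
    """Classify a segment by how surprising its realized defaults are.

    The p-value is the chance of seeing at least `n_defaults` defaults if
    the forecast PD were correct. Cut-offs are illustrative assumptions.
    """
    p_value = binomial_tail(n_customers, n_defaults, forecast_pd)
    if p_value < red:
        return "red", p_value
    if p_value < yellow:
        return "yellow", p_value
    return "green", p_value


# Segment of 200 customers with a forecast PD of 2% (4 expected defaults).
as_expected = traffic_light(200, 4, 0.02)
far_too_many = traffic_light(200, 12, 0.02)
```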

    Decisioning platform

    Versioning Support

    Statistica Compliance Edition is an integration of Statistica with a highly scalable document management system that enables you to securely manage documents of any kind, and it is designed to ensure compliance with FDA 21 CFR Part 11 regulations, Sarbanes-Oxley legislation, as well as ISO 9000, 9001, and 14001 documentation requirements. New functionality provides for easy version comparison and opening of previous versions of documents.

    VERSION COMPARISON
    Now when SDMS integration is enabled, you can compare different versions of SDMS objects. Each versionable object will have a text representation:

    Data Configuration: list of query, data types, and OLE DB column properties

    IQC Analysis Configuration: summary of QC settings/parameters

    SVB Analysis Configuration: SVB text and properties

    Rules object: text representation of rules

    PMML object: PMML representation of model

    Workflow: text detailing all contained nodes and parameters

    Weight of Evidence

    This new product is important to anyone engaged in binary prediction (yes/no). This tool automates the time-consuming task of binning predictors. Two methods are used:

    Optimal

    Interpreted (e.g., observed risk of prediction probability)

    Rules Builder

    Every organization has rules that govern its behavior. Consistently applying these rules to analytic projects or reports is a common challenge. Rules Builder solves this problem. Business users, developers, or modelers find it easy to create, maintain, share, and re-use sets of rules. A rule set for data transformation could be created and then used by one or thousands of analytic projects. Role-based security controls access to these rules. Rules Builder has the ability to conditionally execute models with pre-scoring segment rules and then apply post-scoring policy rules. Rules can retrieve reason codes for individual predictions, which can be critical for many industries, such as banking or insurance. For example, banks are required to state why a loan application was denied. The execution of rules can be visually traced with sample data to aid in troubleshooting complex scenarios.
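
    For context on the Weight of Evidence product described above: once a predictor has been binned, the weight of evidence of a bin is commonly computed as the log of the ratio between the bin's share of "good" outcomes and its share of "bad" outcomes. The Python sketch below shows that calculation for bins that are already defined. It does not implement the Optimal or Interpreted binning methods, and the sign convention, bin labels, and counts are assumptions.

```python
from math import log


def weight_of_evidence(bins):
    """Compute WoE per bin and the total information value.

    `bins` maps a bin label to (n_good, n_bad); every count must be positive.
    Convention assumed here: WoE = ln(share of goods / share of bads).
    """
    total_good = sum(good for good, _ in bins.values())
    total_bad = sum(bad for _, bad in bins.values())
    woe, information_value = {}, 0.0
    for label, (good, bad) in bins.items():
        share_good = good / total_good
        share_bad = bad / total_bad
        woe[label] = log(share_good / share_bad)
        information_value += (share_good - share_bad) * woe[label]
    return woe, information_value


# Hypothetical bins of an income predictor with (good, bad) counts.
bins = {"low": (100, 50), "mid": (300, 30), "high": (100, 20)}
woe, iv = weight_of_evidence(bins)
```

    A negative value (the "low" bin here) marks a bin with relatively more bad outcomes than the population as a whole, and a positive value marks the opposite.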

    New components

    Statistica Reporting Tables (optional)

    Businesses are challenged to:

    Summarize large amounts of data into formats that are easily understood

    Easily emphasize particular data segments (e.g., only report on Oklahoma and France)

    Statistica Reporting Tables (an optional product to be purchased separately for Version 12) automatically sorts and summarizes data based on specifications made while developing the table. The tables are generated interactively by visually dragging and dropping variables into the appropriate four sections of the Reporting Tables dialog box (Layers, Column Label, Row Label, and Sigma). As the tables are customized, they can be previewed, and final results can be generated with the click of a button.

    Options are available for processing Multiple Response Categories, Crosstable Groups, and Conditional Formatting.
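
    The kind of summary such a table produces can be sketched as a cross-tabulation: one variable defines the rows, another the columns, and a statistic is computed per cell. The Python sketch below is a generic illustration, not the Reporting Tables product; the function name, the choice of the mean as the statistic, and the sample data are assumptions.

```python
from collections import defaultdict


def summary_table(records, row_key, column_key, value_key):
    """Cross-tabulate records and report the mean of value_key per cell."""
    cells = defaultdict(list)
    for record in records:
        cells[(record[row_key], record[column_key])].append(record[value_key])
    return {cell: sum(values) / len(values) for cell, values in cells.items()}


# Hypothetical data restricted to two segments, as in the example above.
sales = [
    {"region": "Oklahoma", "year": 2013, "revenue": 10.0},
    {"region": "Oklahoma", "year": 2013, "revenue": 14.0},
    {"region": "France", "year": 2013, "revenue": 8.0},
    {"region": "France", "year": 2014, "revenue": 9.0},
]
table = summary_table(sales, row_key="region", column_key="year", value_key="revenue")
```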

