SAS Global Forum 2016
10000 - A Macro That Can Fix Data Length Inconsistency and Detect Data Type Inconsistency Common tasks that we need to perform are merging or appending SAS® data sets. During this process, we sometimes get error or warning messages saying that the same fields in different SAS data sets have different lengths or different types. If the problems involve a lot of fields and data sets, we need to spend a lot of time identifying those fields and writing extra SAS code to solve the issues. However, the macro in this paper can help you identify the fields that have inconsistent data type or length issues. It also solves the length issues automatically by finding the maximum field length among the current data sets and assigning that length to the field. An HTML report generated after running the macro includes information about which fields’ lengths have been changed and which fields have inconsistent data types. 10 Minutes Quick Tip Ting Sa
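The core of such a length fix can be sketched with DICTIONARY.COLUMNS; the data set and variable names below (WORK.DS1, WORK.DS2, NAME) are illustrative, not the paper's actual macro:

```sas
/* Hypothetical sketch: query DICTIONARY.COLUMNS for the widest
   definition of a shared character field, then declare that length
   before combining the data sets so nothing is truncated. */
proc sql noprint;
   select max(length) into :maxlen trimmed
   from dictionary.columns
   where libname = 'WORK'
     and memname in ('DS1', 'DS2')
     and upcase(name) = 'NAME';
quit;

data work.combined;
   length name $ &maxlen;   /* widest length wins */
   set work.ds1 work.ds2;
run;
```

A full macro would loop this logic over every common character variable and also compare the TYPE column to flag numeric/character mismatches.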
10040 - Something Old, Something New: Flexible Reporting with DATA Step-Based Tools The report looks simple enough—a bar chart and a table, like something created with GCHART and REPORT procedures. But there are some twists to the reporting requirements that make those procedures not quite flexible enough. It's often the case that the programming tools and techniques we envision using for a project or are most comfortable with aren't necessarily the best to use. Fortunately, SAS® can provide many ways to get results. Rather than procedure-based output, the solution here was to mix "old" and "new" DATA step-based techniques to solve the problem. Annotate data sets are used to create the bar chart and the Report Writing Interface (RWI) is used to create the table. Without a whole lot of additional code, you gain an extreme amount of flexibility. 50 Minutes Breakout Pete Lund
10080 - Using the REPORT Procedure to Export a Big Data Set to an External XML File A number of SAS® tools can be used to report data, such as the PRINT, MEANS, TABULATE, and REPORT procedures. The REPORT procedure is a single tool that can produce many of the same results as other SAS tools. Not only can it create detailed reports like PROC PRINT can, but it can summarize and calculate data like the MEANS and TABULATE procedures do. Unfortunately, despite its power, PROC REPORT seems to be used less often than the other tools, possibly due to its seemingly complex coding. This paper uses PROC REPORT and the Output Delivery System (ODS) to export a big data set into a customized XML file that a user who is not familiar with SAS can easily read. Several options for the COLUMN, DEFINE, and COMPUTE statements are shown that enable you to present your data in a more colorful way. We show how to control the format of the selected columns and rows, how to make column headings more meaningful, and how to color selected cells differently to bring attention to the most important data. 30 Minutes E-Poster Guihong Chen
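As a rough illustration of this ODS route, the ExcelXP tagset writes spreadsheet XML that non-SAS users can open directly in Excel; the data set, threshold, and styling below are illustrative:

```sas
/* Sketch: PROC REPORT output routed to an XML file via ODS, with a
   COMPUTE block coloring cells above a threshold. */
ods tagsets.excelxp file='cars_report.xml' style=HTMLBlue;

proc report data=sashelp.cars nowd;
   column make type msrp;
   define make / group 'Manufacturer';
   define type / group 'Vehicle Type';
   define msrp / analysis mean 'Avg MSRP' format=dollar10.;
   compute msrp;
      if msrp.mean > 40000 then
         call define(_col_, 'style', 'style=[background=lightyellow]');
   endcomp;
run;

ods tagsets.excelxp close;
```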
10081 - An Application of the PRINQUAL Procedure to Develop a Synthetic Index of Customer Value for a Colombian Financial Institution Currently Colpatria, as a part of Scotiabank in Colombia, has several methodologies that enable us to have a vision of the customer from a risk perspective. However, the current trend in the financial sector is to have a global vision that involves aspects of risk as well as of profitability and utility. As part of the business strategies to develop cross-selling and customer profitability under risk conditions, it is necessary to create a customer value index that scores each customer according to different groups of key business variables describing the profitability and risk of each customer. In order to generate the Index of Customer Value, we propose to construct a synthetic index using principal component analysis and multiple factorial analysis. 20 Minutes Breakout Ivan Atehortua
10082 - A Pseudo-Interactive Approach to Teaching SAS® Programming With the advent of the exciting new hybrid field of Data Science, programming and data management skills are in greater demand than ever and have never been easier to attain. Online resources like codecademy and w3schools offer a host of tutorials and assistance to those looking to develop their programming abilities and knowledge. Though their content is limited to languages and tools suited mostly for web developers, the value and quality of these sites are undeniable. To this end, similar tutorials for other free-to-use software applications are springing up. The interactivity of these tutorials elevates them above most, if not all, other out-of-classroom learning tools. The process of learning programming or a new language can be quite disjointed when trying to pair a textbook or similar walk-through material with matching coding tasks and problems. These sites unify these pieces for users by presenting them with a series of short, simple lessons that always require the user to demonstrate their understanding in a coding exercise before progressing. After teaching SAS® in a classroom environment, I became fascinated by the potential for a similar student-driven approach to learning SAS. This could afford me more time to provide individualized attention, as well as open up additional class time to more advanced topics. In this talk, I discuss my development of a series of SAS scripts that walk the user through learning the basics of SAS and that involve programming at every step of the process. This collection of scripts should serve as a self-contained, pseudo-interactive course in SAS basics that students could be asked to complete on their own in a few weeks, leaving the remainder of the term to be spent on more challenging, realistic tasks. 20 Minutes Breakout Hunter Glanz
10120 - Integrated Marketing with SAS® in the Age of Real Time Real-time, integrated marketing solutions are a necessity for maintaining your competitive advantage. This presentation provides a brief overview of three SAS products (SAS® Marketing Automation, SAS® Real-Time Decision Manager, and SAS® Event Stream Processing) that form a basis for building modern, real-time, interactive marketing solutions. It presents typical (and also possible) customer-use cases that you can implement with a comprehensive real-time interactive marketing solution, in major industries like finance (banking), telco, and retail. It demonstrates typical functional architectures that need to be implemented to support business cases (how solution components collaborate with customer’s IT landscape and with each other). And it provides examples of our experience in implementing these solutions—dos and don’ts, best practices, and what to expect from an implementation project. 50 Minutes Breakout Dmitriy Alergant
Marje Fecht
10160 - Design for Success: An Approach to Metadata Architecture for Distributed Visual Analytics Metadata is an integral and critical part of any environment. Metadata facilitates resource discovery and provides unique identification of every single digital component of a system, simple to complex. SAS® Visual Analytics, one of the most powerful analytics visualization platforms, leverages the power of metadata to provide a plethora of functionalities for all types of users. The possibilities range from real-time advanced analytics and power-user reporting to advanced deployment features for a robust and scalable distributed platform to internal and external users. This paper explains the best practices and advanced approaches for designing and managing metadata for a distributed global SAS Visual Analytics environment. Designing and building the architecture of such an environment requires attention to important factors like user groups and roles, access management, data protection, data volume control, performance requirements, and so on. This paper covers how to build a sustainable and scalable metadata architecture through a top-down hierarchical approach. It helps SAS Visual Analytics Data Administrators to improve the platform benchmark through memory mapping, perform administrative data load (AUTOLOAD, Unload, Reload-on-Start, and so on), monitor artifacts of distributed SAS® LASR™ Analytic Servers on co-located Hadoop Distributed File System (HDFS), optimize high-volume access via FullCopies, build customized FLEX themes, and so on. It showcases practical approaches to managing distributed SAS LASR Analytic Servers, offering guest access for global users, managing host accounts, enabling Mobile BI, using power-user reporting features, customizing formats, enabling home page customization, using best practices for environment migration, and much more. 20 Minutes Breakout Ratul Saha
Vignesh Balasubramanian
10180 - Creating and Sharing SAS® ODS Graphics with a Code Playground Based on Microsoft Office You've heard that SAS® Output Delivery System (ODS) Graphics provides a powerful and detailed syntax for creating custom graphs, but for whatever reason you still haven't added them to your bag of SAS® tricks. Let's change that! We will also present a code playground based on Microsoft Office that will enable you to quickly try out a variety of prepared SAS ODS Graphics examples, tweak the code, and see the results—all directly from Microsoft Excel. More experienced users will also find the code playground (which is similar in spirit to Google Code Playground or JSFiddle) useful for compiling SAS ODS Graphics code snippets for themselves and for sharing with colleagues, as well as for creating dashboards hosted by Microsoft Excel or Microsoft PowerPoint that contain precisely sized and placed SAS graphics. 20 Minutes Breakout Ted Conway
Zeke Torres
10200 - Using the SAS® Hash Object with Duplicate Key Entries By default, the SAS® hash object permits only entries whose keys, defined in its key portion, are unique. While in certain programming applications this is a rather utile feature, there are also others for which being able to insert and manipulate entries with duplicate keys is imperative. Such an ability, facilitated in SAS since SAS® 9.2, was a welcome development: It vastly expanded the functionality of the hash object and eliminated the necessity to work around the distinct-key limitation using custom code. However, nothing comes without a price, and the ability of the hash object to store duplicate key entries is no exception. In particular, additional hash object methods had to be—and were—developed to handle specific entries sharing the same key. The extra price is that using these methods is surely not quite as straightforward as the simple corresponding operations on distinct-key tables, and the documentation alone is a rather poor help for making them work in practice. Rather extensive experimentation and investigative coding is necessary to make that happen. This paper is the result of such an endeavor, and hopefully, it will save those who delve into it a good deal of time and frustration. 50 Minutes Breakout Paul Dorfman
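The basic duplicate-key retrieval pattern the paper examines can be sketched as follows (data set and variable names are illustrative): FIND retrieves the first entry for a key, and FIND_NEXT walks the remaining same-key entries.

```sas
/* Sketch: retrieve every entry sharing a key from a MULTIDATA hash. */
data work.matches;
   if _n_ = 1 then do;
      if 0 then set work.lookup;   /* define host variables for the hash */
      declare hash h(dataset:'work.lookup', multidata:'y');
      h.definekey('id');
      h.definedata('id', 'value');
      h.definedone();
   end;
   set work.keys;
   rc = h.find();                  /* first entry for this key, if any */
   do while (rc = 0);
      output;                      /* one row per duplicate-key entry */
      rc = h.find_next();          /* next entry with the same key */
   end;
   drop rc;
run;
```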
10220 - Correcting the Quasi-complete Separation Issue in Logistic Regression Models Quasi-complete separation is a commonly detected issue in logit/probit models. Quasi-complete separation occurs when a dependent variable separates an independent variable or a combination of several independent variables to a certain degree. In other words, levels in a categorical variable or values in a numeric variable are separated by groups of discrete outcome variables. Most of the time, this happens in categorical independent variable(s). Quasi-complete separation can cause convergence failures in logistic regression, which consequently result in large coefficient estimates and inflated standard errors. Therefore, it can potentially yield biased results. This paper provides comprehensive reviews of various approaches for correcting quasi-complete separation in binary logistic regression models and presents hands-on examples from the healthcare insurance industry. First, it introduces the concept of quasi-complete separation and how to diagnose the issue. Then, it provides step-by-step guidelines for fixing the problem, from a straightforward data configuration approach to a complicated statistical modeling approach such as the EXACT method, the FIRTH method, and so on. 20 Minutes Breakout Xinghe Lu
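For reference, the FIRTH correction mentioned above is a single MODEL-statement option in PROC LOGISTIC; the data set and variable names here are illustrative:

```sas
/* Sketch: penalized maximum likelihood (Firth) to cope with
   quasi-complete separation; profile-likelihood confidence limits
   (CLPARM=PL) are a common pairing with this method. */
proc logistic data=work.claims;
   class plan_type / param=ref;
   model readmit(event='1') = plan_type age / firth clparm=pl;
run;
```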
10240 - Instant Interactive SAS® Log Window Analyzer An interactive SAS® environment is preferred for developing programs as it gives the flexibility of instantly viewing the log in the log window. The programmer must review the log window to ensure that each and every single line of a written program is running successfully without displaying any messages defined by SAS that are potential errors. Reviewing the log window every time is not only time-consuming but also prone to manual error for any level of programmer. Just to confirm that the log is free from error, the programmer must check the log. Currently in the interactive SAS environment there is no way to get an instant notification about the generated log from the Log window, indicating whether there have been any messages defined by SAS that are potential errors. This paper introduces an instant approach to analyzing the Log window using the SAS macro %ICHECK that displays the reports instantly in the same SAS environment. The report produces a summary of all the messages defined by SAS in the Log window. The programmer does not need to add %ICHECK at the end of the program. Whether a single DATA step, a single PROC step, or the whole program is submitted, the %ICHECK macro is automatically executed at the end of every submission. It might be surprising to you to learn how a compiled macro can be executed without calling it in the Editor window. But it is possible with %ICHECK, and you can develop it using only SAS products. It can be used in a Windows or UNIX interactive SAS environment without requiring any user inputs. With the proposed approach, there is a significant benefit in the log review process and a 100% gain in time saved for all levels of programmers because the log is free from error. Similar functionality can be introduced in the SAS product itself. 10 Minutes Quick Tip Palanisamy Mohan
10260 - SAS® Macro for Generalized Method of Moments Estimation for Longitudinal Data with Time-Dependent Covariates Longitudinal data with time-dependent covariates is not readily analyzed as there are inherent, complex correlations due to the repeated measurements on the sampling unit and the feedback process between the covariates in one time period and the response in another. A generalized method of moments (GMM) logistic regression model (Lalonde, Wilson, and Yin 2014) is one method for analyzing such correlated binary data. While GMM can account for the correlation due to both of these factors, it is imperative to identify the appropriate estimating equations in the model. Cai and Wilson (2015) developed a SAS® macro using SAS/IML® software to fit GMM logistic regression models with extended classifications. In this paper, we expand the use of this macro to allow for continuous responses and as many repeated time points and predictors as possible. We demonstrate the use of the macro through two examples, one with binary response and another with continuous response. 20 Minutes Breakout Katherine Cai
10340 - Agile BI: How Eandis is using SAS® Visual Analytics for Energy Grid Management Eandis is a rapidly growing energy distribution grid operator in the heart of Europe, with requirements to manage power distribution on behalf of 229 municipalities in Belgium. With a legacy SAP data warehouse and other diverse data sources, business leaders at Eandis faced challenges with timely analysis of key issues such as power quality, investment planning, and asset management. To face those challenges, a new agile way of thinking about Business Intelligence (BI) was necessary. A sandbox environment was introduced where business key-users could explore and manipulate data. It allowed them to have approachable analytics and to build prototypes. Many pitfalls appeared and the greatest challenge was the change in mindset for both IT and business users. This presentation addresses those issues and possible solutions. 50 Minutes Breakout Olivier Goethals
10360 - Nine Frequently Asked Questions about Getting Started with SAS® Visual Analytics You’ve heard all the talk about SAS® Visual Analytics—but maybe you are still confused about how the product would work in your SAS® environment. Many customers have the same points of confusion about what they need to do with their data, how to get data into the product, how SAS Visual Analytics would benefit them, and even should they be considering Hadoop or the cloud. In this paper, we cover the questions we are asked most often about implementation, administration, and usage of SAS Visual Analytics. 50 Minutes Breakout Tricia Aanderud
Ryan Kumpfmiller
Nick Welke
10381 - Pastries, Microbreweries, Diamonds, and More: Small Businesses Can Profit with SAS® Today, there are 28 million small businesses, which account for 54% of all sales in the United States. The challenge is that small businesses struggle every day to accurately forecast future sales. These forecasts not only drive investment decisions in the business, but also are used in setting daily par, determining labor hours, and scheduling operating hours. In general, owners use their gut instinct. Using SAS® provides the opportunity to develop accurate and robust models that can unlock cost savings for small business owners in a short amount of time. This research examines over 5,000 records from the first year of daily sales data for a start-up small business, while comparing the four basic forecasting models within SAS® Enterprise Guide®. The objective of this model comparison is to demonstrate how quick and easy it is to forecast small business sales using SAS Enterprise Guide. What does that mean for small businesses? More profit. SAS provides cost-effective models for small businesses to better forecast sales, resulting in better business decisions. 30 Minutes E-Poster Cameron Jagoe
10400 - Designed to Fail: Approximately Right vs Precisely Wrong Electrolux is one of the largest appliance manufacturers in the world. Electrolux North America sells more than 2,000 products to end consumers through 9,000 business customers. To grow and increase profitability under challenging market conditions, Electrolux partnered with SAS® to implement an integrated platform for SAS® for Demand-Driven Planning and Optimization and improve service levels to its customers. The process uses historical order data to create a statistical monthly forecast. The Electrolux team then reviews the statistical forecast in SAS® Collaborative Planning Workbench, where they can add value based on their business insights and promotional information. This improved monthly forecast is broken down to the weekly level where it flows into SAS® Inventory Optimization Workbench. SAS Inventory Optimization Workbench then computes weekly inventory targets to satisfy the forecasted demand at the desired service level. This presentation also covers how Electrolux implemented this project. Prior to the commencement of the project, Electrolux and the SAS team jointly worked to quantify the value of the project and set the right expectations with the executive team. A detailed timeline with regular updates helped provide visibility to all stakeholders. Finally, a clear change management strategy was also developed to define the roles and responsibilities after the implementation of SAS for Demand-Driven Planning and Optimization. 20 Minutes Breakout Aaron Raymond
Pratapsinh Patil
10401 - Responsible Gambling Model at Veikkaus Our company Veikkaus is a state-owned gambling and lottery company in Finland that has a national legalized monopoly for gambling. All the profit we make goes back to Finnish society (for art, sports, science, and culture), and this is done by our government. In addition to the government's requirements of profit, the state (Finland) also requires us to handle the adverse social aspects of gaming, such as problem gambling. The challenge in our business is to balance between these two factors. To address problem gambling, we have used SAS® tools to create a responsible gaming tool, called VasA, based on a logistic regression model. The name VasA is derived from the Finnish words for "Responsible Customership." The model identifies problem gamblers from our customer database using the data from identified gaming, money transfers, web behavior, and customer data. The variables that were used in the model are based on the theory behind problem gambling. Our actions for problem gambling include, for example, different CRM and personalization of a customer's website in our web service. There were several companies that provided responsible gambling tools as such for us to buy, but we wanted to create our own for two reasons. Firstly, we wanted it to include our whole customer database, meaning all our customers and not just those customers who wanted to take part in it. These other tools normally include only customers who want to take part. The other reason was that we saved a ridiculous amount of money by doing it ourselves compared to buying one. During this process, SAS played a big role, from gathering the data to the construction of the tool, and from modeling to creating the VasA variables, then on to the database, and finally to the analyses and reporting. 20 Minutes Breakout Tero Kallioniemi
10404 - Put Your Data on the Map: Using the GEOCODE and GMAP Procedures to Create Bubble Maps in SAS® A bubble map is a useful tool for identifying trends and visualizing the geographic proximity and intensity of events. This session describes how to use readily available map data sets in SAS® along with PROC GEOCODE and PROC GMAP to turn a data set of addresses and events into a bubble map of the United States with scaled bubbles depicting the location and intensity of events. 20 Minutes Breakout Caroline Walker
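A minimal version of this workflow can be sketched as follows; the WORK.EVENTS input (ZIP codes per event) and the per-state summary data set are illustrative assumptions:

```sas
/* Sketch: geocode addresses by ZIP code; X/Y coordinates are added
   to the output (the default lookup data is SASHELP.ZIPCODE). */
proc geocode method=zip data=work.events out=work.events_geo;
run;

/* Bubbles scaled by a per-state count, drawn on the US map.
   WORK.STATE_COUNTS is assumed to hold STATE (FIPS code) and EVENTS. */
proc gmap data=work.state_counts map=maps.us all;
   id state;
   bubble events / levels=6 bcolor=blue;
run;
quit;
```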
10420 - On The Fly SAS® Reports In today’s fast-paced work environment, time management is crucial to the success of the project. Sending requests to SAS® programmers to run reports every time you need to get the most current data can be a stretch sometimes on an already strained schedule. Why bother to contact the programmer? Why not build the execution of the SAS program into the report itself so that when the report is launched, real-time data is retrieved and the report shows the most recent data? This paper demonstrates that by opening an existing SAS report in Microsoft Word or Microsoft Excel, the data in the report refreshes automatically. Simple Visual Basic for Applications (VBA) code is written in Word or Excel. When an existing SAS report is opened, this VBA code calls the SAS programs that create the report from within a Microsoft Office product and overwrites the existing report data with the most current data. 10 Minutes Quick Tip Ron Palanca
10460 - Missing Values: They Are NOT Nothing When analyzing data with SAS®, we often encounter missing or null values in data. Missing values can arise from the availability, collectibility, or other issues with the data. They represent the imperfect nature of real data. Under most circumstances, we need to clean, filter, separate, impute, or investigate the missing values in data. These processes can take up a lot of time, and they are annoying. For these reasons, missing values are usually unwelcome and need to be avoided in data analysis. There are two sides to every coin, however. If we can think outside the box, we can take advantage of the negative features of missing values for positive uses. Sometimes, we can create and use missing values to achieve our particular goals in data manipulation and analysis. These approaches can make data analyses convenient and improve work efficiency for SAS programming. This kind of creative and critical thinking is the most valuable quality for data analysts. This paper exploits real-world examples to demonstrate the creative uses of missing values in data analysis and SAS programming, and discusses the advantages and disadvantages of these methods and approaches. The illustrated methods and advanced programming skills can be used in a wide variety of data analysis and business analytics fields. 20 Minutes Breakout Justin Jia
Amanda Lin
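One small example in the spirit of this paper: special missing values (.A through .Z and ._) can encode why a value is absent while still behaving as missing in statistics. The survey coding below is illustrative:

```sas
/* Sketch: recode sentinel survey answers into special missing values
   so the reason for missingness survives, yet the values stay out of
   means and sums like ordinary missing values. */
data work.survey_clean;
   set work.survey_raw;                   /* assumed: numeric ANSWER */
   if answer = 98 then answer = .D;       /* "don't know" */
   else if answer = 99 then answer = .R;  /* "refused"    */
run;

proc freq data=work.survey_clean;
   tables answer / missing;   /* .D and .R appear as separate rows */
run;
```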
10480 - Architecture and Deployment of SAS® Visual Analytics 7.2 with a Distributed SAS® LASR™ Analytic Server for Showcasing Reports to the Public This paper demonstrates the deployment of SAS® Visual Analytics 7.2 with a distributed SAS® LASR™ Analytic Server and an internet-facing web tier. The key factor in this deployment is to establish the secure web server connection using a third-party certificate to perform the client and server authentication. The deployment process involves the following steps: 1) Establish the analytics cluster, which consists of SAS® High-Performance Deployment of Hadoop and the deployment of high-performance analytics environment master and data nodes. 2) Get the third-party signed certificate and the key files. 3) Deploy the SAS Visual Analytics server tier and middle tier. 4) Deploy the standalone web tier with HTTP protocol configured using secure sockets. 5) Deploy the SAS® Web Infrastructure Platform. 6) Perform post-installation validation and configuration to handle the certificate between the servers. 20 Minutes Breakout Vimal Raj Arockiasamy
Ratul Saha
10481 - Product Purchase Sequence Analyses by Using a Horizontal Data Sorting Technique Horizontal data sorting is a very useful SAS® technique in advanced data analysis when you are using SAS programming. Two years ago (SAS® Global Forum Paper 376-2013), we presented and illustrated various methods and approaches to perform horizontal data sorting, and we demonstrated its valuable application in strategic data reporting. However, this technique can also be used as a creative analytic method in advanced business analytics. This paper presents and discusses its innovative and insightful applications in product purchase sequence analyses such as product opening sequence analysis, product affinity analysis, next best offer analysis, time-span analysis, and so on. Compared to other analytic approaches, the horizontal data sorting technique has the distinct advantages of being straightforward, simple, and convenient to use. This technique also produces easy-to-interpret analytic results. Therefore, the technique can have a wide variety of applications in customer data analysis and business analytics fields. 30 Minutes E-Poster Justin Jia
Amanda Lin
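Horizontal (within-row) sorting itself is a one-call affair in the DATA step; the array of purchase amounts below is an illustrative stand-in for the paper's purchase-sequence data:

```sas
/* Sketch: sort values across a row with CALL SORTN (use CALL SORTC
   for character data); each customer's five amounts end up in
   ascending order within the observation. */
data work.sorted_row;
   set work.purchases;          /* assumed: one row per customer, p1-p5 */
   array p{5} p1-p5;
   call sortn(of p{*});
run;
```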
10540 - Bridging the Gap: Importing Health Indicators Warehouse Data into SAS® Visual Analytics Using SAS® Stored Processes and APIs The Health Indicators Warehouse (HIW) is part of the US Department of Health and Human Services’ (DHHS) response to make federal data more accessible. Through it, users can access data and metadata for over 1,200 indicators from approximately 180 federal and nonfederal sources. The HIW also supports data access by applications such as SAS® Visual Analytics through the use of an application programming interface (API).  An API serves as a communication interface for integration.  As a result of the API, HIW data consumers using SAS Visual Analytics can avoid difficult manual data processing. This paper provides detailed information about how to access HIW data with SAS Visual Analytics in order to produce easily understood visualizations with minimal effort through a methodology that automates HIW data processing. This paper also shows how to run SAS® macros inside a stored process to make HIW data available in SAS Visual Analytics for exploration and reporting via API calls; the SAS macros are provided. Use cases involving dashboards are also examined in order to demonstrate the value of streaming data directly from the HIW. Both IT professionals and population health analysts will benefit from understanding how to import HIW data into SAS Visual Analytics using SAS® Stored Processes, macros, and APIs.  This can be very helpful to organizations that want to lower maintenance costs associated with data management while gaining insights into health data with visualizations.  This paper provides a starting point for any organization interested in deriving full value from SAS Visual Analytics while augmenting their work with HIW data. 50 Minutes Breakout Li Hui Chen
Manuel Figallo
10560 - No Perm Space, No Problem! Using SAS® and SAS/ACCESS® to Submit Multi-Step Queries with Teradata Temporary Tables As an analyst with little to no perm space in your Teradata user account, do you think your analyses are limited to a single query submission at a time in Teradata from SAS®? If so, think again! With the use of Teradata temporary volatile tables you can greatly enhance your analyses by submitting multi-step queries from SAS for processing in Teradata using the temporary volatile tables to store interim query step results. You can do this while returning only the final results you need for your analyses. Using this approach, analysts can spend more time focusing on important analyses than on cumbersome data prep steps to attain analysis results. Teradata temporary volatile tables are reviewed and an example illustrating their use in a multi-step query is presented using SAS/ACCESS® to Teradata and explicit SQL pass-through to submit Teradata SQL from PROC SQL calls in SAS. 50 Minutes Table Talk Laurie Kudla
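The pattern relies on a persistent global connection so the volatile table survives between the EXECUTE step and the final pass-through query; the server name, credentials, and table names below are illustrative:

```sas
/* Sketch: multi-step Teradata query with a volatile table holding the
   interim result; only the final rows return to SAS. */
proc sql;
   connect to teradata (server=tdprod user=&td_user password=&td_pass
                        mode=teradata connection=global);
   execute (
      create volatile table cust_totals as (
         select cust_id, sum(amt) as total_amt
         from prod_db.transactions
         group by cust_id
      ) with data primary index (cust_id) on commit preserve rows
   ) by teradata;
   create table work.big_spenders as
      select * from connection to teradata (
         select cust_id, total_amt
         from cust_totals
         where total_amt > 1000
      );
   disconnect from teradata;
quit;
```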
10561 - Making it Happen: A Novel Way to Save Taxpayer Dollars by Implementing an In-House SAS® Data Analytics and Research Center As part of promoting a data-driven culture and data analytics modernization at its federal sector clientele, Northrop Grumman developed a framework for designing and implementing an in-house Data Analytics and Research Center (DAARC) using a SAS® set of tools. This DAARC provides a complete set of SAS® Enterprise BI (Business Intelligence) and SAS® Data Management tools. The platform can be used for data research, evaluations, and analysis and reviews by federal agencies such as the Social Security Administration (SSA), the Center for Medicare and Medicaid Services (CMS), and others. DAARC architecture is based on a SAS data analytics platform with newer capabilities of data mining, forecasting, visual analytics, and data integration using SAS® Business Intelligence. These capabilities enable developers, researchers, and analysts to explore big data sets with varied data sources, create predictive models, and perform advanced analytics including forecasting, anomaly detection, use of dashboards, and creating online reports. The DAARC framework that Northrop Grumman developed enables agencies to implement a self-sufficient "analytics as a service" approach to meet their business goals by making informed and proactive data-driven decisions. This paper provides a detailed approach to how the DAARC framework was established in strong partnership with federal customers of Northrop Grumman. This paper also discusses the best practices that were adopted for implementing specific business use cases in order to save taxpayer dollars through many research-related analytical and statistical initiatives that continue to use this platform. 50 Minutes Breakout Vivek Sethunatesan
10600 - You Can Bet on It: Missing Observations Are Preserved with the PRELOADFMT and COMPLETETYPES Options Do you write reports that sometimes have missing categories across all class variables? Some programmers write all sorts of additional DATA step code in order to show the zeros for the missing rows or columns. Did you ever wonder whether there is an easier way to accomplish this? PROC MEANS and PROC TABULATE, in conjunction with PROC FORMAT, can handle this situation with a couple of powerful options. With PROC TABULATE, we can use the PRELOADFMT and PRINTMISS options in conjunction with a user-defined format in PROC FORMAT to accomplish this task. With PROC SUMMARY, we can use the COMPLETETYPES option to get all the rows with zeros. This paper uses examples from Census Bureau tabulations to illustrate the use of these procedures and options to preserve missing rows or columns. 10 Minutes Quick Tip Janet Wysocki
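The option combination this abstract names can be sketched as follows; the format, data set, and variable names here are hypothetical, but the PRELOADFMT/PRINTMISS and COMPLETETYPES patterns follow the documented usage:

```sas
proc format;
   value regfmt 1='Northeast' 2='Midwest' 3='South' 4='West';
run;

/* PROC TABULATE: PRELOADFMT on the CLASS statement plus PRINTMISS on
   the TABLE statement prints a row for every level of the format,
   even levels with no observations in the data */
proc tabulate data=sales;
   class region / preloadfmt;
   format region regfmt.;
   var amount;
   table region, amount*(n sum) / printmiss;
run;

/* PROC SUMMARY: COMPLETETYPES creates output rows for all class-level
   combinations, so empty categories appear with zero counts */
proc summary data=sales completetypes nway;
   class region / preloadfmt;
   format region regfmt.;
   var amount;
   output out=region_totals n= sum= / autoname;
run;
```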
10620 - Strategies for Developing and Delivering an Effective Online SAS® Programming Course Teaching online courses is very different from teaching in the classroom setting. Developing and delivering an effective online class entails more than just transferring traditional course materials into written documents and posting them in a course shell. This paper discusses the author’s experience in converting a traditional hands-on introductory SAS® programming class into an online course and presents some ideas for promoting successful learning and knowledge transfer when teaching online. 20 Minutes Breakout Justina Flavin
10621 - Risk Adjustment Methods in Value-Based Reimbursement Strategies Value-based reimbursement is the emerging strategy in the US healthcare system. The premise of value-based care is simple in concept—high quality and low cost provides the greatest value to patients and the various parties that fund their coverage. The basic equation for value is equally simple to compute: value=quality/cost. However, there are significant challenges to measuring it accurately. Error or bias in measuring value could result in the failure of this strategy to ultimately improve the healthcare system. This session discusses various methods and issues with risk adjustment in a value-based reimbursement model. Risk adjustment is an essential tool for ensuring that fair comparisons are made when deciding which health services and health providers have high value. The goal of this presentation is to give analysts an overview of risk adjustment and to provide guidance for when, why, and how to use risk adjustment when quantifying performance of health services and healthcare providers on both cost and quality. Statistical modeling approaches are reviewed and practical issues with developing and implementing the models are discussed. Real-world examples are also provided. 50 Minutes Breakout Daryl Wansink
10640 - Optimizing Airline Pilot Connection Time Using PROC REG and PROC LOGISTIC As any airline traveler knows, connection time is a key element of the travel experience. A tight connection time can cause angst and concern, while a lengthy connection time can introduce boredom and a longer than desired travel time. The same elements apply when constructing schedules for airline pilots. Like passengers, pilot schedules are built with connections. Delta Air Lines operates a hub and spoke system that feeds both passengers and pilots from the spoke stations and connects them through the hub stations. Pilot connection times that are tight can result in operational disruptions, whereas extended pilot connection times are inefficient and unnecessarily costly. This paper demonstrates how Delta Air Lines used SAS® PROC REG and PROC LOGISTIC to analyze historical data in order to build operationally robust and financially responsible pilot connections. 20 Minutes Breakout Andy Hummel
10641 - The Path Length: Parent-Child De-lineage with PROC TREE and ODS The SAS® procedure PROC TREE sketches parent-child lineage—also known as trees—from hierarchical data. Hierarchical relationships can be difficult to flatten out into a data set, but with PROC TREE, its accompanying ODS table TREELISTING, and some creative yet simple handcrafting, a de-lineage of parent-children variables can be derived. Because the focus of PROC TREE is to provide the tree structure in graphical form, it does not explicitly output the depth of the tree, although the depth is visualized in the accompanying graph. Perhaps unknown to most, the path length variable, or simply the height of the tree, can be extracted from PROC TREE merely by capturing it from the ODS output, as demonstrated in this paper. 10 Minutes Quick Tip Can Tongur
10662 - Avoid Change Control by Using Control Tables Developers working on a production process need to think carefully about ways to avoid future changes that require change control, so it's always important to make the code dynamic rather than hardcoding items into the code.  Even if you are a seasoned programmer, the hardcoded items might not always be apparent. This paper assists in identifying the harder-to-reach hardcoded items and addresses ways to effectively use control tables within the SAS® software tools to deal with sticky areas of coding such as formats, parameters, grouping/hierarchies, and standardization. The paper presents examples of several ways to use the control tables and demonstrates why this usage prevents the need for coding changes. Practical applications are used to illustrate these examples. 20 Minutes Breakout Frank Ferriola
10663 - Applying Frequentist and Bayesian Logistic Regression to MOOC data in SAS®: a Case Study Massive Open Online Courses (MOOC) have attracted increasing attention in educational data mining research areas. MOOC platforms provide free higher education courses to Internet users worldwide. However, MOOCs have high enrollment but notoriously low completion rates. The goal of this study is to apply frequentist and Bayesian logistic regression to investigate whether and how students’ engagement, intentions, education levels, and other demographics are conducive to course completion in MOOC platforms. The original data used in this study came from an online eight-week course titled “Big Data in Education,” taught within the Coursera platform (MOOC) by Teachers College, Columbia University. The data sets for analysis were created from three different sources—clickstream data, a pre-course survey, and homework assignment files. The SAS system provides multiple procedures to perform logistic regression, with each procedure having different features and functions. In this study, we apply two approaches, frequentist and Bayesian logistic regression, to MOOC data. PROC LOGISTIC is used for the frequentist approach, and PROC GENMOD is used for Bayesian analysis. The results obtained from the two approaches are compared. All the statistical analyses are conducted in SAS® 9.3. Our preliminary results show that MOOC students with higher course engagement and higher motivation are more likely to complete the MOOC course. 30 Minutes E-Poster Yan Zhang
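In outline, the two modeling approaches the abstract compares might look like this (the variable names are hypothetical placeholders, not the study's actual fields):

```sas
/* Frequentist logistic regression */
proc logistic data=mooc;
   class edu_level / param=ref;
   model completed(event='1') = engagement intent edu_level;
run;

/* Bayesian logistic regression via the BAYES statement in PROC GENMOD */
proc genmod data=mooc descending;
   class edu_level;
   model completed = engagement intent edu_level / dist=binomial link=logit;
   bayes seed=20160418 nmc=10000;   /* posterior sampling settings */
run;
```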
10680 - Key Features in ODS Graphics for Efficient Clinical Graphing High-quality effective graphs not only enhance understanding of the data but also facilitate regulators in the review and approval process. In recent SAS® releases, SAS has made significant progress toward more efficient graphing in ODS Statistical Graphics (SG) procedures and Graph Template Language (GTL). A variety of graphs can be quickly produced using convenient built-in options in SG procedures. With graphical examples and comparison between SG procedures and traditional SAS/GRAPH® procedures in reporting clinical trial data, this paper highlights several key features in ODS Graphics to efficiently produce sophisticated statistical graphs with more flexible and dynamic control of graphical presentation including: 1) Better control of axes in different scales and intervals; 2) Flexible ways to control graph appearance; 3) Plots overlay in single-cell or multi-cell graphs; 4) Enhanced annotation; 5) Classification panel of multiple plots with individualized labeling. 10 Minutes Quick Tip Yuxin (Ellen) Jiang
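As a small illustration of the SG flexibility the abstract highlights, axis scaling, grouping, and appearance can be controlled directly in PROC SGPLOT (data set and variable names are hypothetical):

```sas
proc sgplot data=adlb;
   vbox aval / category=visit group=trt;        /* grouped box plots */
   yaxis type=log label='Lab Value (log scale)'; /* log-scale axis control */
   xaxis label='Visit';
   keylegend / title='Treatment';
run;
```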
10701 - Running Projects for the Average Joe This paper explores some proven methods used to automate complex SAS® Enterprise Guide® projects so that the average Joe can run them with little or no prior experience. There are often times when a programmer is requested to extract data and dump it into Microsoft Excel for a user. Often these data extracts are very similar and can be run with previously saved code. However, the user quite often has to wait for the programmer to have the time to simply run the code. By automating the code, the programmer regains control over their data requests. This paper discusses the benefits of establishing macro variables and creating stored procedures, among other tips. 10 Minutes Quick Tip Jennifer Davies
10721 - Can You Stop the Guesswork in Your Marketing Budget Allocation? Marketing Mixed Modeling Using SAS® Can Help! Even though marketing is inevitable in every business, every year the marketing budget is limited and prudent fund allocations are required to optimize marketing investment. In many businesses, the marketing fund is allocated based on the marketing manager’s experience, departmental budget allocation rules, and sometimes "gut feelings" of business leaders. Those traditional ways of budget allocation yield suboptimal results and in many cases lead to money being wasted on irrelevant marketing efforts. Marketing mixed models can be used to understand the effects of marketing activities and identify the key marketing efforts that drive the most sales among a group of competing marketing activities. The results can be used in marketing budget allocation to take out the guesswork that typically goes into the budget allocation. In this paper, we illustrate practical methods for developing and implementing marketing mixed modeling using SAS® procedures. Real-life challenges of marketing mixed model development and execution are discussed, and several recommendations are provided to overcome some of those challenges. 20 Minutes Breakout Delali Agbenyegah
10722 - Detecting Phishing Attempts with SAS®: Minimally Invasive Email Log Data Phishing is the attempt of a malicious entity to acquire personal, financial, or otherwise sensitive information such as user names and passwords from recipients through the transmission of seemingly legitimate emails. By quickly alerting recipients of known phishing attacks, an organization can reduce the likelihood that a user will succumb to the request and unknowingly provide sensitive information to attackers. Methods to detect phishing attacks typically require the body of each email to be analyzed. However, most academic institutions do not have the resources to scan individual emails as they are received, nor do they wish to retain and analyze message body data. Many institutions simply rely on the education and participation of recipients within their network. Recipients are encouraged to alert information security (IS) personnel of potential attacks as they are delivered to their mailboxes. This paper explores a novel and more automated approach that uses SAS® to examine email header and transmission data to determine likely phishing attempts that can be further analyzed by IS personnel. Previously, a collection of 2,703 emails from an external filtering appliance was examined with moderate success. This paper focuses on the gains from analyzing an additional 50,000 emails, with the inclusion of an additional 30 known attacks. Real-time email traffic is exported from Splunk Enterprise into SAS for analysis. The resulting model aids in determining the effectiveness of alerting IS personnel to potential phishing attempts faster than a user simply forwarding a suspicious email to IS personnel. 30 Minutes E-Poster Taylor Anderson
10723 - SAS-IO, a Browser-Based Tool to Automate Creation of Inputs/Outputs Lists for SAS® Programs High-quality documentation of SAS® code is standard practice in multi-user environments for smoother group collaborations. One of the documentation items that facilitate program sharing and retrospective review is a header section at the beginning of a SAS program highlighting the main features of the program, such as the program’s name, its creation date, the program’s aims, the programmer’s identification, and the project title. In this header section, it is helpful to keep a list of the inputs and outputs of the SAS program (for example, SAS data sets and files that the program used and created). This paper introduces SAS-IO, a browser-based HTML/JavaScript tool that can automate production of such an input/output list. This can save the programmers’ time, especially when working with long SAS programs. 30 Minutes E-Poster Mohammad Reza Rezai
10725 - Using SAS® to Implement Simultaneous Linking in Item Response Theory The objective of this study is to use the GLM procedure in SAS® to solve a complex linkage problem with multiple test forms in educational research. Typically, the ABSORB option in the GLM procedure makes this task relatively easy to implement. Note that for educational assessments, to apply one-dimensional combinations of two-parameter logistic (2PL) models (Hambleton, Swaminathan, and Rogers 1991, ch. 1) and generalized partial credit models (Muraki 1997) to a large-scale high-stakes testing program with very frequent administrations requires a practical approach to link test forms. Haberman (2009) suggested a pragmatic solution of simultaneous linking to solve the challenging linking problem. In this solution, many separately calibrated test forms are linked by the use of least-squares methods. In SAS, the GLM procedure can be used to implement this algorithm by the use of the ABSORB option for the variable that specifies administrations, as long as the data are sorted by order of administration. This paper presents the use of SAS to examine the application of this proposed methodology to a simple case of real data. 20 Minutes Breakout Lili Yao
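A schematic of the ABSORB approach described above (names are illustrative; the key requirements are that the data be sorted by the absorbed variable and that the ABSORB statement precede the model specification):

```sas
/* ABSORB requires the data to be sorted by the absorbed variable */
proc sort data=calibrations;
   by admin;
run;

proc glm data=calibrations;
   absorb admin;          /* absorbs administration effects without
                             estimating a parameter for each one */
   class form;
   model item_difficulty = form / solution;
run;
```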
Jun Xu
10727 - Using SAS® Geocoding as a Form of Field Interview Verification The Add Health Parent Study is using a new and innovative method to augment our other interview verification strategies. Typical verification strategies include calling respondents to ask questions about their interview, recording pieces of interaction (CARI - Computer Aided Recorded Interview), and analyzing timing data to see that each interview was within a reasonable length. Geocoding adds another tool to the toolbox for verifications. By applying street-level geocoding to the address where an interview is reported to be conducted and comparing that to a captured latitude/longitude reading from a GPS tracking device, we are able to compute the distance between two points. If that distance is very small and time stamps are close to each other, then the evidence points to the field interviewer being present at the respondent’s address during the interview. For our project, the street-level geocoding to an address is done using SAS® PROC GEOCODE. Our paper describes how to obtain a US address database from the SAS website and how it can be used in PROC GEOCODE. We also briefly compare this technique to using the Google Map API and Python as an alternative. 20 Minutes Breakout Chris Carson
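In outline, the verification computation might look like this; the lookup library and all data set and variable names are placeholders (the street lookup data itself is downloaded separately from the SAS website):

```sas
/* Street-level geocoding against a street lookup data set */
proc geocode method=street data=interviews out=interviews_geo
     lookupstreet=lookup.usm;
run;

/* PROC GEOCODE returns coordinates in X (longitude) and Y (latitude);
   GEODIST computes the great-circle distance, 'K' = kilometers */
data verify;
   set interviews_geo;
   dist_km = geodist(y, x, gps_lat, gps_long, 'K');
run;
```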
10728 - A Silver Lining in the Cloud: Deployment of SAS® Visual Analytics 7.2 on AWS Amazon Web Services (AWS) as a platform for analytics and data warehousing has gained significant adoption over the years. With SAS® Visual Analytics being one of the preferred tools for data visualization and analytics, it is imperative to be able to deploy SAS Visual Analytics on AWS. This ensures swift analysis and reporting on large amounts of data with SAS Visual Analytics by minimizing the movement of data across environments. This paper focuses on installing SAS Visual Analytics 7.2 in an Amazon Web Services environment, migration of metadata objects and content from previous versions to the deployment on the cloud, and ensuring data security. 20 Minutes Breakout Vimal Raj Arockiasamy
Rajesh Inbasekaran
10729 - Using SAS® to Conduct Multivariate Statistical Analysis in Educational Research: Exploratory Factor Analysis and Confirmatory Factor Analysis Multivariate statistical analysis plays an increasingly important role as the number of variables being measured increases in educational research. In both cognitive and noncognitive assessments, many instruments that researchers aim to study contain a large number of variables, with each measured variable assigned to a specific factor of the bigger construct. Recalling the educational theories or empirical research, the factor of each instrument usually emerges the same way. Two types of factor analysis are widely used in order to understand the latent relationships among these variables based on different scenarios. (1) Exploratory factor analysis (EFA), which is performed by using the SAS® procedure PROC FACTOR, is an advanced statistical method used to probe deeply into the relationship among the variables and the larger construct and then develop a customized model for the specific assessment. (2) When a model is established, confirmatory factor analysis (CFA) is conducted by using the SAS procedure PROC CALIS to examine the model fit of specific data and then make adjustments for the model as needed. This paper presents the application of SAS to conduct these two types of factor analysis to fulfill various research purposes. Examples using real noncognitive assessment data are demonstrated, and the interpretation of the fit statistics is discussed. 20 Minutes Breakout Jun Xu
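The two analyses can be sketched as follows; the items, factor names, and factor-to-item assignments are hypothetical:

```sas
/* Exploratory factor analysis: maximum likelihood with varimax rotation */
proc factor data=survey method=ml rotate=varimax scree;
   var item1-item20;
run;

/* Confirmatory factor analysis for a hypothesized three-factor model */
proc calis data=survey;
   factor
      Engagement ===> item1-item7,
      Motivation ===> item8-item14,
      Anxiety    ===> item15-item20;
run;
```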
Lili Yao
10740 - Developing an On-Demand Web Report Platform Using Stored Processes and SAS® Web Application Server As SAS® programmers, we often develop listings, graphs, and reports that need to be delivered frequently to our customers. We might decide to manually run the program every time we get a request, or we might easily schedule an automatic task to send a report at a specific date and time. Both scenarios have some disadvantages. If the report is manual, we have to find and run the program every time someone requests an updated version of the output. It takes some time and it is not the most interesting part of the job. If we schedule an automatic task in Windows, we still sometimes get an email from the customers because they need the report immediately. That means that we have to find and run the program for them. This paper explains how we developed an on-demand report platform using SAS® Enterprise Guide®, SAS® Web Application Server, and stored processes. We had developed many reports for different customer groups, and we were getting more and more emails from them asking for updated versions of their reports. We felt we were not using our time wisely and decided to create an infrastructure where users could easily run their programs through a web interface. The tool that we created enables SAS programmers to easily release on-demand web reports with minimum programming. It has web interfaces developed using stored processes for the administrative tasks, and it also automatically customizes the front end based on the user who connects to the website. One of the challenges of the project was that certain reports had to be available to a specific group of users only. 20 Minutes Breakout Romain Miralles
10741 - Using SAS® Comments to Run SAS Code in Parallel Our daily work in SAS® involves manipulation of many independent data sets, and a lot of time can be saved if independent data sets can be manipulated simultaneously. This paper presents our interface RunParallel, which opens multiple SAS sessions and controls which SAS procedures and DATA steps to run on which sessions by parsing comments such as /*EP.SINGLE*/ and /*EP.END*/. The user can easily parallelize any code by simply wrapping procedure steps and DATA steps in such comments and executing in RunParallel. The original structure of the SAS code is preserved so that it can be developed and run in serial regardless of the RunParallel comments. When applied in SAS programs with many external data sources and heavy computations, RunParallel can give major performance boosts. Among our examples we include a simulation that demonstrates how to run DATA steps in parallel, where the performance gain greatly outweighs the single minute it takes to add RunParallel comments to the code. In a world full of big data, a lot of time can be saved by running in parallel in a comprehensive way. 50 Minutes Breakout Jingyu She
Tomislav Kajinic
10742 - Building a Recommender System with SAS® to Improve Cross-Selling for Online Retailers Nowadays, the recommender system is a popular tool for online retailer businesses to predict customers’ next-product-to-buy (NPTB). Based on statistical techniques and the information collected by the retailer, an efficient recommender system can suggest a meaningful NPTB to customers. A useful suggestion can reduce the customer’s searching time for a wanted product and improve the buying experience, thus increasing the chance of cross-selling for online retailers and helping them build customer loyalty. Within a recommender system, the combination of advanced statistical techniques with available information (such as customer profiles, product attributes, and popular products) is the key element in using the retailer’s database to produce a useful suggestion of an NPTB for customers. This paper illustrates how to create a recommender system with the SAS® RECOMMEND procedure for online business. Using the recommender system, we can produce predictions, compare the performance of different predictive models (such as decision trees or multinomial discrete-choice models), and make business-oriented recommendations from the analysis. 20 Minutes Breakout Shanshan Cong
10760 - Visualizing Eye-Tracking Data with SAS®: Creating Heat Maps on Images Few data visualizations are as striking or as useful as heat maps, and fortunately there are many applications for them. A heat map displaying eye-tracking data is a particularly potent example: the intensity of a viewer’s gaze is quantified and superimposed over the image being viewed, and the resulting data display is often stunning and informative. This paper shows how to use Base SAS® to prepare, smooth, and transform eye-tracking data and ultimately render it on top of a corresponding image. By customizing a graphical template and using specific Graph Template Language (GTL) options, a heat map can be drawn precisely so that the user maintains pixel-level control. In this talk, and in the related paper, eye-tracking data is used primarily, but the techniques provided are easily adapted to other fields such as sports, weather, and marketing. 30 Minutes E-Poster Matthew Duchnowski
10761 - Medicare Fraud Analytics Using Cluster Analysis: How PROC FASTCLUS Can Refine the Identification of Peer Comparison Groups Although limited to a small fraction of health care providers, the existence and magnitude of fraud in health insurance programs requires the use of fraud prevention and detection procedures. Data mining methods are used to uncover odd billing patterns in large databases of health claims history. Efficient fraud discovery can involve the preliminary step of deploying automated outlier detection techniques in order to classify identified outliers as potential fraud before an in-depth investigation. An essential component of the outlier detection procedure is the identification of proper peer comparison groups to classify providers as within-the-norm or outliers. This study refines the concept of peer comparison group within the provider category and considers the possibility of distinct billing patterns associated with medical or surgical procedure codes identifiable by the Berenson-Eggers Type of System (BETOS). “The BETOS system covers all HCPCS codes (Health Care Procedure Coding System); assigns a HCPCS code to only one BETOS code; consists of readily understood clinical categories; consists of categories that permit objective assignment…” (Centers for Medicare and Medicaid Services, CMS). The study focuses on the specialty “General Practice” and involves two steps: first, the identification of clusters of similar BETOS-based billing patterns; and second, the assessment of the effectiveness of these peer comparison groups in identifying outliers. The working data set is a sample of the summary of 2012 data of physicians active in health care government programs made publicly available by the CMS through its website. The analysis uses PROC FASTCLUS with the SAS® cubic clustering criterion to find the optimal number of clusters in the data. It also uses PROC ROBUSTREG to implement a multivariate adaptive threshold outlier detection method. 20 Minutes Breakout Paulo Macedo
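One hedged sketch of the cluster-count search described above, comparing the cubic clustering criterion that PROC FASTCLUS reports across candidate cluster counts (data set and variable names are hypothetical):

```sas
%macro try_k(k);
   proc fastclus data=betos_profiles maxclusters=&k
        out=clus&k outstat=stat&k;
      var betos_share1-betos_share10;  /* BETOS-based billing shares */
   run;
%mend try_k;

/* Compare the printed cubic clustering criterion across runs */
%try_k(3)
%try_k(4)
%try_k(5)
```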
10780 - Increasing Efficiency by Parallel Processing Working with big data is often time consuming and challenging. The primary goal in programming is to maximize throughput while minimizing the use of computer processing time, real time, and programmers’ time. By using the Multiprocessing (MP) CONNECT method on a symmetric multiprocessing (SMP) computer, a programmer can divide a job into independent tasks and execute the tasks as threads in parallel on several processors. This paper demonstrates the development and application of a parallel processing program on a large amount of health-care data. 30 Minutes E-Poster Shuhua Liang
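A minimal MP CONNECT sketch, assuming SAS/CONNECT is licensed; the library and data set names are hypothetical:

```sas
options autosignon sascmd="!sascmd";

/* Spawn two sessions that summarize different years in parallel */
rsubmit task1 wait=no inheritlib=(health);
   proc means data=health.claims_2014 noprint;
      var paid_amt;
      output out=health.sum_2014 sum=;
   run;
endrsubmit;

rsubmit task2 wait=no inheritlib=(health);
   proc means data=health.claims_2015 noprint;
      var paid_amt;
      output out=health.sum_2015 sum=;
   run;
endrsubmit;

waitfor _all_ task1 task2;   /* block until both tasks complete */
signoff _all_;
```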
10821 - Four Lines of Code: Using Merge and Colons to Construct Historical Data from Status Tables Collection of customer data is often done in status tables or snapshots, where, for example, for each month, the values for a handful of variables are recorded in a new status table whose name is marked with the value of the month. In this QuickTip, we present how to construct a table of last occurrence times for customers using a DATA step merge of such status tables and the colon (":") wildcard. If the status tables are sorted, this can be accomplished in four lines of code (where RUN; is the fourth). Also, we look at how to construct delta tables (for example, from one time period to another, or which customers have arrived or left) using a similar method of merge and colons. 10 Minutes Quick Tip Jingyu She
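The four lines look roughly like this, assuming monthly snapshots named status_YYYYMM (so alphabetical order is chronological) and each sorted by the BY variable:

```sas
/* "status_:" matches every data set whose name starts with "status_".
   In a MERGE, values from later data sets overwrite earlier ones,
   so each customer keeps the values from their last occurrence. */
data last_status;
   merge status_: ;
   by customer_id;
run;
```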
10840 - How to Speed Up Your Validation Process Without Really Trying This paper provides tips and techniques to speed up the validation process without and with automation. For validation without automation, it introduces both standard use and clever use of options and statements to be implemented in the COMPARE procedure that can speed up the validation process. For validation with automation, a macro named %QCDATA is introduced for individual data set validation, and a macro named %QCDIR is introduced for comparison of data sets in two different directories. Also introduced in this section is a definition of &SYSINFO and an explanation of how it can be of use to interpret the result of the comparison. 50 Minutes Breakout Alice Cheng
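As a hedged illustration of the &SYSINFO usage mentioned above (data set names are placeholders): after PROC COMPARE runs, &SYSINFO holds a return code in which 0 means an exact match and non-zero values form a bit mask encoding the kinds of differences found. It must be checked before a subsequent step resets it:

```sas
proc compare base=prod.demo compare=qc.demo noprint;
run;

%macro check_compare;
   %if &sysinfo = 0 %then %put NOTE: Data sets match exactly.;
   %else %put WARNING: Differences found (SYSINFO=&sysinfo).;
%mend check_compare;
%check_compare
```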
Justina Flavin
10841 - Data Review Listings on Auto-Pilot: Using SAS® and Windows Server to Automate Reports and Flag Incremental Data Records During the course of a clinical trial study, large numbers of new and modified data records are received on an ongoing basis. Providing end users with an approach to continuously review and monitor study data, while enabling them to focus reviews on new or modified (incremental) data records, allows for greater efficiency in identifying potential data issues. In addition, supplying data reviewers with a familiar machine-readable output format (for example, Microsoft Excel) allows for greater flexibility in filtering, highlighting, and retention of data reviewers’ comments. In this paper, we outline an approach using SAS® in a Windows server environment and a shared folder structure to automatically refresh data review listings. Upon each execution, the listings are compared against previously reviewed data to flag new and modified records, as well as carry forward any data reviewers’ comments made during the previous review. In addition, we highlight the use and capabilities of the SAS® ExcelXP tagset, which enables greater control over data formatting, including management of Microsoft Excel’s sometimes undesired automatic formatting. Overall, this approach provides a significantly improved end-user experience above and beyond the more traditional approach of performing cumulative or incremental data reviews using PDF listings. 20 Minutes Breakout Victor Lopez
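The ExcelXP control the abstract refers to looks roughly like this (file and data set names are hypothetical; SHEET_NAME, FROZEN_HEADERS, and AUTOFILTER are documented tagset options):

```sas
ods tagsets.excelxp file="ae_review.xml" style=minimal
    options(sheet_name='AE Listing'
            frozen_headers='yes'    /* keep the header row visible */
            autofilter='all');      /* add Excel filters to all columns */

proc print data=ae_incremental noobs label;
run;

ods tagsets.excelxp close;
```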