
Friday, September 22, 2017

Introduction to Insurance Domain / Industry

Insurance is an intricate mechanism in which risk is transferred from an individual to a group, and losses are shared on an equitable basis by all members of the group. This post briefly discusses a few core functions and operations of the insurance industry.

Insurance Types
In general, there are two types of insurance: (1) Life Insurance and (2) General Insurance.
Life Insurance can be divided into two classes: (a) policies that provide pure life insurance protection, called term insurance, and (b) policies that include a savings or investment element.

Some common types of General Insurance are Motor Insurance, Property Insurance, Personal Accident Insurance, Medical and Health Insurance, Travel Insurance, Credit Insurance, Third Party Insurance and Liability Insurance.

Key Insurance Functions
It helps to know the facets of an insurance company's operations that set it apart. The nature of the insurance sector requires the specialized functions below, which generally do not exist in the same form in other businesses.

Ratemaking / Insurance Pricing
  • Ratemaking is the process of predicting future losses and expenses and allocating these costs among a large number of insureds. The price of insurance is therefore always based on predictions
  • A major part of ratemaking is identifying the characteristics that predict future losses and adjusting premiums for different risk groups accordingly
  • Insurance rates are subject to government regulation. Rates must be adequate, must not be excessive or unfairly discriminatory, and should be fairly stable over time
  • The rate and the premium paid by the insured are two different things: the rate is the price per unit of insurance, while the premium is the rate multiplied by the number of units purchased. The insurer's total premium income should be sufficient to cover future losses and expenses
  • Important terms and formulas to understand: pure premium, gross premium, gross rate, loading, expected loss ratio
  • There are two types of rates: (a) class rates and (b) individual rates. Class rates are common and apply across a large number of people with similar characteristics (for example, in the same area), whereas individual rates are set for a single insured on the basis of judgment, schedule, experience and/or retrospective rating
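The terms listed above relate to each other through a few textbook formulas. The sketch below uses the common pure premium method (pure premium = incurred losses and loss-adjustment expenses divided by exposure units; gross rate = pure premium / (1 − expense ratio)). The figures are hypothetical and chosen only to keep the arithmetic simple; real ratemaking involves far more than this.

```python
def pure_premium(incurred_losses: float, exposure_units: int) -> float:
    """Expected loss cost (losses plus loss-adjustment expenses) per exposure unit."""
    return incurred_losses / exposure_units


def gross_rate(pure: float, expense_ratio: float) -> float:
    """Pure premium grossed up so the loading covers expenses and profit."""
    if not 0 <= expense_ratio < 1:
        raise ValueError("expense_ratio must be in [0, 1)")
    return pure / (1 - expense_ratio)


# Hypothetical class: 500,000 insured cars, 30 million in incurred losses
# and loss-adjustment expenses, and an expense ratio of 40%.
EXPENSE_RATIO = 0.40
pure = pure_premium(30_000_000, 500_000)   # about 60 per car
rate = gross_rate(pure, EXPENSE_RATIO)     # about 100 per car
loading = rate - pure                      # about 40 per car
expected_loss_ratio = 1 - EXPENSE_RATIO    # share of the rate expected to pay losses
gross_premium = rate * 2                   # premium for a policy covering 2 cars

print(pure, rate, loading, expected_loss_ratio, gross_premium)
```

Note how the rate is a price per exposure unit, while the gross premium depends on how many units the policy covers.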

Production (Sales and Marketing)
  • Production refers to the sales and marketing activities of an insurer. Agents who sell insurance are frequently referred to as producers
  • This department is responsible for selecting and appointing agents and assisting them in sales. It also provides technical assistance to agents

Underwriting
  • Underwriting is the process of selecting, classifying and pricing applicants for insurance
  • The ratemaking function is performed by actuaries. Actuaries set the insurance rate based on specific variables, while underwriters decide which variables apply to a specific applicant
  • In life insurance, applicants may be classified as standard, preferred, substandard or uninsurable
  • The primary objective of underwriting is to attain an underwriting profit, that is, to produce a profitable book of business

Claim Settlement
  • Basic objectives of any claim settlement are - (a) verification of a covered loss (b) fair and prompt payment of claims (c) personal assistance to the insured
  • Adjusters are the individuals who investigate losses. They determine liability and the amount of payment to be made
  • A company has two basic options when confronted with a claim: pay or contest
  • The claim settlement procedure follows simple steps: (1) Notice of loss (2) Investigation (3) Proof of loss (4) Payment or denial of the claim

Other functions in the insurance domain include Investment, Legal, Accounting, Billing, etc.

Methods of insurance
  • Co-insurance – a risk shared between insurers
  • Dual insurance – a risk covered by two or more policies with the same coverage. The policies do not each pay the full loss separately; under a concept named contribution, they contribute together to make up the policyholder's loss. In contingency insurance such as Life insurance, however, dual payment is allowed
  • Self-insurance – a situation where risk is not transferred to an insurance company and is retained solely by the entity or individual
  • Reinsurance – a situation where an insurer passes part or all of a risk to another insurer, called the reinsurer
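To make the contribution idea under dual insurance concrete, here is a minimal sketch that splits one loss across overlapping policies in proportion to their sums insured. Pro rata by sum insured is only one common way to apportion a loss and is an assumption here, as are the insurer names and amounts.

```python
def contribution_shares(loss: float, sums_insured: dict[str, float]) -> dict[str, float]:
    """Split one loss across overlapping policies pro rata by sum insured."""
    total_cover = sum(sums_insured.values())
    # Indemnity principle: together the insurers never pay more than the loss,
    # and never more than the total cover available.
    payable = min(loss, total_cover)
    return {
        insurer: payable * cover / total_cover
        for insurer, cover in sums_insured.items()
    }


# Hypothetical property insured under two policies for the same peril.
shares = contribution_shares(90_000, {"Insurer A": 100_000, "Insurer B": 200_000})
print(shares)
```

In this sketch the policyholder recovers the loss once, with each insurer paying in proportion to the cover it sold.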

Thursday, September 14, 2017

How is Testing Different for Healthcare Applications?

The approach to testing an application depends on many factors, and the domain or industry is one of them. Why? Because it helps in enumerating the critical risks associated with an application, which can’t be ignored. For example, in healthcare, the most important aspects of testing are patient safety and compliance with government regulations. Listed below are tips and pointers to consider while testing healthcare provider applications.


Healthcare Compliance & Regulatory Environment
Healthcare compliance and the regulatory environment are among the most complex to understand and apply. Most of these regulations are in place to ensure that hospitals protect patient health records. This makes security testing of utmost importance for any healthcare application. Violating these regulations is risky and damaging. Below are some examples of regulations that the healthcare industry in the USA must comply with.
  • HIPAA - The Health Insurance Portability and Accountability Act of 1996 is United States legislation that provides data privacy and security provisions for safeguarding medical information
  • HITECH - The Health Information Technology for Economic and Clinical Health Act was enacted in 2009 to stimulate the adoption of electronic health records (EHR) and supporting technology in the United States

Conformance to Different Standards
Usually, different applications are deployed in different departments of a hospital or clinic. Often, these applications also communicate with remote departments of other hospitals and with other stakeholders (e.g. insurers). Seamless communication over standard protocols is a must for the scalability and sustainability of the applications. Testers need to understand the standards supported by the application or component and must validate its integration with the different interfaces. Some of the most common standards followed in the healthcare industry are as follows -
  • HL7 (Health Level 7) - set of international standards for the transfer of clinical and administrative data between software applications used by various healthcare providers. These standards focus on the application layer, which is "layer 7" in the OSI model
  • DICOM (Digital Imaging and Communications in Medicine) - standard for storing and transmitting images. It includes a file format definition and a network communication protocol
  • HL7 CDA (Clinical Document Architecture) - provides an exchange model (XML-based) for clinical documents (such as discharge summaries and progress notes); formerly known as the Patient Record Architecture (PRA)
  • CCR (Continuity of Care Record) - a standard for the creation of electronic summaries of patient health
  • CCOW (Clinical Context Object Workgroup) - International standard protocol designed to enable disparate applications to synchronize in real time, and at the user-interface level
  • LOINC (Logical Observation Identifiers Names and Codes) - Universal standard for identifying medical laboratory observations. It applies universal code names and identifiers to medical terminology related to electronic health records. The purpose is to assist in the electronic exchange and gathering of clinical results (such as laboratory tests, clinical observations and outcomes)
  • ELINCS (EHR-Lab Interoperability and Connectivity Standards) - An emerging standard for reporting lab test results
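When validating an HL7 interface, it helps to see what a message looks like on the wire. The sketch below parses a hand-written HL7 v2 style message without any library: segments are separated by carriage returns, fields by `|` and components by `^`. The sample message, facility names and MRN are made up for illustration, and the parser deliberately ignores repetitions, escape sequences and repeated segments, so treat it as a reading aid rather than a conformant HL7 parser.

```python
# Hypothetical ADT message with only the two segments needed for the example.
SAMPLE_MESSAGE = "\r".join([
    "MSH|^~\\&|LAB|HOSP_A|EMR|HOSP_A|201709140830||ADT^A01|MSG00001|P|2.5",
    "PID|1||MRN12345^^^HOSP_A^MR||DOE^JANE||19800101|F",
])


def parse_segments(message: str) -> dict[str, list[str]]:
    """Map each segment ID (MSH, PID, ...) to its pipe-separated fields."""
    segments = {}
    for line in message.split("\r"):
        if line:
            fields = line.split("|")
            segments[fields[0]] = fields
    return segments


def patient_summary(message: str) -> dict[str, str]:
    """Pull a few PID fields: PID-3 identifier, PID-5 name, PID-7 birth date, PID-8 sex."""
    pid = parse_segments(message)["PID"]
    family, given = pid[5].split("^")[:2]
    return {
        "mrn": pid[3].split("^")[0],
        "family_name": family,
        "given_name": given,
        "birth_date": pid[7],
        "sex": pid[8],
    }


print(patient_summary(SAMPLE_MESSAGE))
```

A tester can use the same idea to assert that the fields sent by one system arrive unchanged in the receiving system.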

Complex Workflows & Data Integrity
  • Interconnected workflows should be tested against various parameters such as different types of tests, operations, consultations, plans, brokers, members, commissions etc.
  • Unlike in other domains, healthcare software needs to be tested in an order that follows the patient flow through the system
  • Validate and verify the complex images generated during the workflow
  • Pay attention to workflows that allow an action when it should be restricted, e.g. the application might allow users to place medication orders (prescriptions) for patients who have already been discharged
  • Different healthcare applications from different vendors might be used to cover a single workflow, which makes E2E testing difficult
  • Many embedded calculations may be used to produce dosing amounts on patient medication orders. Medication doses differ among infants, children and adults. If dosing calculations are incorrect, a pediatric patient could receive an adult dose, which could be lethal
  • For medical applications, verify that the medications, dosages, units and data are exactly as entered and remain that way between application sessions. It's also vital to check for data corruption, as well as hard-to-read text or images that might cause confusion
  • Workflows can be highly dependent on the test data being used, e.g. only the elderly are eligible for geriatric treatments, infants for neonatal procedures and women for gynaecological treatments
  • Dates used as test data are important and need to be accurate and in context
  • Data displayed on screens is used by doctors and nurses to make decisions about patient prescriptions. In the midst of testing it's easy to glance over the display without truly reading it, so keep the focus on data accuracy
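Two of the points above, restricted actions and date-sensitive test data, can be turned into small automated checks. The sketch below is hypothetical throughout: `Patient`, `can_place_medication_order` and `age_in_years` stand in for whatever the application under test exposes, and the rule "no new orders after the discharge date" is an assumed business rule, not one taken from any specific product.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Patient:
    name: str
    birth_date: date
    discharged_on: Optional[date] = None  # None means still admitted


def can_place_medication_order(patient: Patient, order_date: date) -> bool:
    """Assumed rule: no new medication orders after the discharge date."""
    return patient.discharged_on is None or order_date <= patient.discharged_on


def age_in_years(patient: Patient, on: date) -> int:
    """Completed years of age on a given date (drives age-based eligibility)."""
    years = on.year - patient.birth_date.year
    if (on.month, on.day) < (patient.birth_date.month, patient.birth_date.day):
        years -= 1
    return years


inpatient = Patient("Test Inpatient", date(1950, 3, 1))
discharged = Patient("Test Discharged", date(2000, 9, 14), discharged_on=date(2017, 9, 10))

# Negative test: the action must be blocked once the patient is discharged.
assert can_place_medication_order(inpatient, date(2017, 9, 14))
assert not can_place_medication_order(discharged, date(2017, 9, 14))

# Boundary test: the day before and the day of a birthday give different ages.
assert age_in_years(discharged, date(2017, 9, 13)) == 16
assert age_in_years(discharged, date(2017, 9, 14)) == 17
```

Boundary dates like these are where eligibility rules tend to break, which is why the test data needs to be chosen deliberately.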

Usability, Security & Performance
  • Most users of healthcare applications are not trained computer professionals, so the user experience is a key factor
  • Display concise and clean data on screen for users who work in high-stress environments such as an emergency room or operating room
  • Because the application must be HIPAA and HITECH compliant, in-depth security testing is a given
  • The number of concurrent users of a healthcare application might not be huge (in the millions), but performance needs to be validated with multiple large media files being transferred over the network

Wednesday, September 6, 2017

DevOps FAQ

What is DevOps?

"DevOps is the combination of cultural philosophies, practices, and tools that increase an organization’s ability to deliver applications and services at high velocity. This speed enables organizations to better serve their customers and compete more effectively in the market. DevOps moves the focus from development to delivery—a subtle but important distinction."
What is DevOps?

In simpler terms, DevOps is a ‘Soch’ (a philosophy) that encourages frequent feedback from customers and corresponding changes to the plan, design or deliverables. It discourages a HiPPO (Highest Paid Person’s Opinion) culture in the company. To get frequent feedback, releases should reach customers in the production environment frequently. This implies that the goal of DevOps is to minimize the time elapsed from ‘acceptance of a change request’ to ‘availability of quality code to customers’. Companies can achieve nirvana in DevOps by automating everything in their delivery process. But DevOps is not all about automation; some companies can minimize deployment time just by changing the culture (e.g. removing red tape during the handover of work between teams, such as manual Go/No-Go approvals).

What are some of the effective DevOps practices?
To realize the goals of effective collaboration and smoother operation, the practices below are adopted -
  • Open and honest participation of all stakeholders (developers, operations staff, testers, support people)
  • One version control system for all code (application code, test scripts, Ops scripts), accessible to all stakeholders
  • Small, frequent and simple changes
  • Never break the consumer code
  • If failing, fail fast and often
  • Set up automated builds and tests
  • Implement Continuous Integration (CI)
  • Deploy the same binaries across all environments (Development, Integration, Staging, Production etc.)
  • Continuous deployment from one sandbox environment to the next, all the way to the production environment
  • Continuous monitoring of the application as well as the infrastructure
  • Real-time insights for the organization’s governance team through an automated dashboard
  • Simple and flawless rollback strategy

How is DevOps different from Agile?
DevOps is an extension of the Agile methodology. Agile focuses on building working software quickly, whereas DevOps covers development as well as deployment of software in the safest and most reliable way possible. If Agile talks about Continuous Integration (CI), DevOps talks about Continuous Integration (CI) and Continuous Delivery (CD). The most important aspect of DevOps is continuous collaboration between the development and operations teams.

Why are enterprises moving, or why do they need to move, to DevOps?
To stay competitive in the current market, enterprise organizations are embracing DevOps methodologies and new technologies to accelerate innovation. Agile development reduces the time taken for development but not for deployment. Post-development tasks take too long and delay getting new business functionality to users. DevOps extends the agile team's focus to communication and collaboration between developers and the operations team, leading to faster customer feedback.

What are some common tools being used in DevOps ecosystem?
The DevOps philosophy doesn’t mandate any tools; rather, it emphasizes communication and collaboration between product management, software development and operations professionals to build, test and release software rapidly, frequently and reliably. But for an effective DevOps implementation, it is important to understand the tools and to implement an effective continuous delivery tool chain. The key is to automate wherever possible during the software delivery process.
The different types of tools required for a DevOps implementation are –
  1. Orchestration and visualization of the deployment pipeline
  2. Version control of source code (Dev, Test & Ops)
  3. Automation of code quality check
  4. Automation and management of builds (CI)
  5. Automation of testing (unit testing, integration testing, system testing, user acceptance testing, service virtualization, performance testing, security testing etc.)
  6. Management of Artifacts (code binaries, resources etc.)
  7. Tools for automatically provisioning the environments (physical, virtual, private and cloud)
  8. Tools for server configuration and deployment management
  9. Tools for monitoring and reporting
| CD Chain | Notes | Tools |
|---|---|---|
| Orchestration and Deployment Pipeline Visualization | Deployment pipeline steps can be built and visualized by integrating the entire existing tool chain. It helps the team detect delays and wait times between steps. | ElectricCommander, CA LISA, IBM UrbanCode, XebiaLabs XL |
| Version Control | Source code and configuration files are version controlled. All text-based files changed by team members should be added to version control. | Git, Mercurial, Perforce, Subversion, TFS, Bazaar, CVS |
| Continuous Integration | Integrate new code with the stable main release line and alert stakeholders if it causes issues in the final product. | Jenkins, Travis CI, ThoughtWorks GO, CircleCI, TeamCity, Bamboo, Gitlab CI |
| Continuous Inspection | Automatic and continuous audit of code quality in terms of maintainability, coding standards, future bugs etc. | SonarQube, CheckStyle, JavaNCSS, CPD, FindBugs, PMD |
| Artifact Management | Focus is on packaged artifacts like application assets, virtual images, configuration data, infrastructure code etc. Artifacts are identifiable, versioned and immutable. Package metadata identifies how and when an artifact was tested and against which environment. | Nexus, Artifactory, Archiva |
| Test Automation | The goal should be to automate the regression suite completely, except for scenarios that can’t be automated. | JMeter, Selenium/WebDriver, Cucumber (BDD), RSpec (BDD), SpecFlow (BDD), LoadUI (Performance), PageSpeed (Performance), Netem (Network Emulation), SoapUI (Web Services) |
| Environment Automation | A test environment can be brought up on demand using an automation tool, which can provision VMs and apply a configuration template. | Vagrant, Docker, Packer |
| Server Configuration and Deployment Management | Tools are used to deploy binaries into the required environment(s). The team needs to ensure that the process is fully automated and can roll back to the previous stable version without any issues. | Ansible, Chef, Puppet, SaltStack |
| Monitoring and Reporting | Log files from all systems can be aggregated at a centralized location and should be indexed and searchable from a web browser. Real-time insights for the organization’s governance team through an automated dashboard. | Application: New Relic, Dynatrace, AppDynamics. Infrastructure: Nagios, Sensu. Logs: Splunk, Fluentd, Heapster, Logstash, Prometheus, WeaveScope |

Can you also point out some common myths around DevOps?
Some of the common myths around DevOps are –
  • Adopting tools makes you DevOps
  • DevOps is about 10 deploys a day
  • DevOps and continuous delivery are same
  • Continuous delivery is only for web companies
  • DevOps is all about automation

What skills are necessary to become a DevOps engineer?
Automation is an integral part of DevOps, and the key is to automate wherever possible. Apart from communication and collaboration, an engineer needs multiple skills to work in a DevOps culture. Below is the list of skills a team requires to implement DevOps successfully. It is highly unlikely that one engineer will have ALL the skills.
  • Senior-level Windows / Linux administration skills to build and administer servers
  • Relevant and practical virtualization experience with VMware, KVM, Xen, Hyper-V
  • A good grasp of storage and networking, to design a solution that scales and performs with high availability and uptime
  • An understanding of fault tolerance and failure domains
  • Scripting experience in at least one language such as Bash, PowerShell, Perl, Ruby, JavaScript or Python, so that code can be written to automate repeatable processes
  • Practical experience in deploying applications on Amazon AWS, Google Cloud or Azure
  • Broad understanding of Security & Performance testing
  • Broad understanding of and practical experience with the tools and technologies below
    • Version control tools like Git, Bitbucket, SVN, VSTS, Perforce
    • Build and Continuous Integration tools like Jenkins, Bamboo, TFS, Maven, Ant, CruiseControl and Hudson
    • Automation tools like Selenium, Cucumber, VSTS etc.
    • Infrastructure automation or Configuration Management tools like Puppet, Chef, Ansible, Vagrant, CFEngine etc.
    • Containers like Docker, LXD
    • Orchestration tools like Kubernetes, Mesos, Swarm etc.
    • Monitoring tools like Nagios, Munin, ELK, Zabbix, Sensu, Logstash, CloudWatch, New Relic etc.

What are KPIs related to DevOps?
The success of DevOps implementation can be measured using below key metrics:
  • Deployment frequency – how many deployments per day/week?
  • Deployment speed – the time elapsed from code commit to its deployment at the customer
  • Deployment failure rate – the proportion of deployments that fail
  • Mean time to recover (MTTR) – the average time to recover from a failure
Measuring these metrics helps determine the maturity level of the DevOps implementation at the customer's site
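As a rough illustration, the four metrics can be computed from a simple deployment log. The `Deployment` record and the sample data below are hypothetical; in practice the timestamps would come from the CI/CD tool and the incident tracker.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    recovered_at: Optional[datetime] = None  # set only for failed deployments


def devops_kpis(deployments: list, period_days: int) -> dict:
    """Summarize deployment frequency, speed, failure rate and MTTR."""
    if not deployments:
        raise ValueError("no deployments in the period")
    failures = [d for d in deployments if d.failed]
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    recoveries = [d.recovered_at - d.deployed_at for d in failures if d.recovered_at]
    return {
        "deployments_per_week": len(deployments) / (period_days / 7),
        "mean_lead_time": sum(lead_times, timedelta()) / len(lead_times),
        "failure_rate": len(failures) / len(deployments),
        "mttr": sum(recoveries, timedelta()) / len(recoveries) if recoveries else None,
    }


# Hypothetical two-week log: four deployments, one of which failed.
log = [
    Deployment(datetime(2017, 9, 1, 9), datetime(2017, 9, 1, 15)),
    Deployment(datetime(2017, 9, 4, 10), datetime(2017, 9, 5, 10),
               failed=True, recovered_at=datetime(2017, 9, 5, 11, 30)),
    Deployment(datetime(2017, 9, 8, 9), datetime(2017, 9, 8, 12)),
    Deployment(datetime(2017, 9, 12, 9), datetime(2017, 9, 12, 12)),
]
kpis = devops_kpis(log, period_days=14)
print(kpis)
```

Tracking the same numbers release after release is what shows whether the implementation is maturing.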

What is the role of testers in DevOps?
DevOps is all about delivering "quality" applications and services at high velocity. When software is being delivered at high velocity, quality becomes the biggest risk and testing needs to be conducted everywhere and should be done by everyone. Software testing practices and mindset are an inseparable part of DevOps.
Testers are looped in throughout the software delivery process, from ideation (shift-left) to production monitoring (shift-right). A DevOps-centric QA strategy needs to be implemented, and Continuous Testing becomes an integral part of the delivery chain. CI/CD becomes meaningless without Continuous Testing. Below are some key roles of QA in DevOps –
  • Development of test automation suites for various environments (Dev, Integration, Pre-Production, and Production)
  • Integration of test automation in the CI / CD tool set
  • Development of DevOps centric test strategy
  • Validation of UX designs
  • Nonfunctional testing like performance & security

What is Continuous Testing? How does it play role in DevOps?
Continuous testing is the process of executing automated tests as part of the software delivery pipeline to obtain immediate feedback on the business risks associated with a software release candidate.

A few years ago, developers were delivering software at lightning speed in a sprint, leaving QA to catch up with sluggish testing practices and minimal coverage across frequent builds. The widespread culture of Agile development accelerated development while software testing lagged, forcing organizations to cut corners in QA or slow down Dev processes entirely. Continuous Testing in DevOps mitigates these risks by going beyond automation and encompassing all DevOps practices, including tooling and culture change. Continuous Testing tries to achieve four capabilities: test early, test faster, test often and automate.

E2E test automation practices integrate QA into the existing fast-paced Dev and Ops processes and create continuity while maintaining faster development cycles. Collaboration, tooling and E2E test automation help in identifying defects early in the life cycle.

Continuous Testing starts within the development process, when Dev uses open source tools like Selenium to test the functionality of their code. Tools such as GitHub are used to store the tests and version them together with the software code. DevOps uses the same tests to integrate frequent builds. Once the code reaches the pre-production environment, the tester can use the same tests, modifying test parameters as appropriate. In the production environment, Ops can reuse these tests for acceptance testing and ongoing production monitoring. Once the application goes live, the same tests can be integrated with an APM tool to keep tabs on server-side app metrics.
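One way to reuse the same tests across environments is to keep the check itself fixed and load only its parameters per environment. The sketch below assumes a hypothetical `TEST_ENV` variable, placeholder hosts and a `/login` path; `fetch_status` stands for any callable that returns an HTTP status code, and a stub is used here so the example runs without a network.

```python
import os

# Hypothetical per-environment parameters; hosts and timeouts are placeholders.
ENVIRONMENTS = {
    "dev": {"base_url": "http://localhost:8080", "timeout_s": 5},
    "preprod": {"base_url": "https://preprod.example.test", "timeout_s": 10},
    "prod": {"base_url": "https://www.example.test", "timeout_s": 10},
}


def load_config(env_name=None):
    """Pick parameters from an explicit name or the TEST_ENV variable."""
    name = env_name or os.environ.get("TEST_ENV", "dev")
    if name not in ENVIRONMENTS:
        raise KeyError(f"unknown test environment: {name}")
    return ENVIRONMENTS[name]


def check_login_page(fetch_status, config):
    """The same check in every environment; only the parameters differ.

    fetch_status is any callable (url, timeout_s) -> HTTP status code.
    """
    return fetch_status(config["base_url"] + "/login", config["timeout_s"]) == 200


def fake_fetch(url, timeout_s):
    """Stub transport so the sketch runs offline."""
    return 200 if url.endswith("/login") else 404


results = {env: check_login_page(fake_fetch, load_config(env)) for env in ENVIRONMENTS}
print(results)
```

In a real pipeline the stub would be replaced by an HTTP client or a browser session, while the check and its parameters stay in version control next to the application code.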

What is Continuous Monitoring? How does it play role in DevOps?
During Continuous Monitoring, the Ops team monitors the application and ensures that it is performing as desired and that the production environment is stable. Until now, this was done with specialized system-level tools owned by the Ops team. DevOps principles suggest monitoring applications as well, which means Ops should use tools that monitor application performance and any (functional) issues along with system-level monitoring. It may also require Ops to work with the Dev / Test team to build self-monitoring or analytics-gathering capabilities right into the applications being built, which would allow for true continuous E2E monitoring.

What do you understand by "Infrastructure as code"? How does it fit into the DevOps methodology? What purpose does it achieve?
"Programmable Infrastructure" is a synonym of "Infrastructure as code". With "Infrastructure as code", there is no need to configure the infrastructure manually. Tools like Vagrant, Ansible, Docker, Chef and Puppet help express infrastructure configuration as code that lives alongside the application code. These tools help in achieving full automation of configuration management. Developers can write not only the application code but also the infrastructure that the code runs on. It helps QA avoid the “It worked on my machine” problem and allows the full suite of tests, from unit to integration to functional, to be run on brand new instances of infrastructure.

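The core idea behind "Infrastructure as code" can be sketched as comparing a declared, version-controlled desired state with the actual state and applying only the difference. This is a toy model for illustration, not how Ansible, Chef, Puppet or any other tool works internally; the server names and specs are invented.

```python
def plan(desired: dict, actual: dict) -> list:
    """List the actions needed to move the actual state to the desired state."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply_plan(desired: dict, actual: dict) -> dict:
    """Return the new actual state after carrying out the plan."""
    new_state = dict(actual)
    for action, name in plan(desired, actual):
        if action == "delete":
            del new_state[name]
        else:
            new_state[name] = dict(desired[name])
    return new_state


# Desired state as it might be declared in a version-controlled file.
desired = {
    "web-1": {"image": "app:1.4", "memory_mb": 512},
    "db-1": {"image": "postgres:9.6", "memory_mb": 2048},
}
# What is currently running.
actual = {
    "web-1": {"image": "app:1.3", "memory_mb": 512},
    "old-worker": {"image": "worker:0.9", "memory_mb": 256},
}

first_plan = plan(desired, actual)
converged = apply_plan(desired, actual)
second_plan = plan(desired, converged)  # empty: applying again changes nothing
print(first_plan, second_plan)
```

Because a second run produces an empty plan, the same definition can be applied repeatedly, which is what makes it safe to build brand new test environments from it.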
Monday, September 4, 2017

Introduction to Health Care Domain / Industry

The healthcare industry, or medical industry, is a sector that provides goods and services to treat patients with curative, preventive, rehabilitative or palliative care. This industry comprises different players including hospitals, doctors, nursing homes, clinics, diagnostic laboratories, pharmacies, medical device manufacturers, and other components of the health care system. It is one of the fastest growing industries in the world. It is important to be aware of the different business processes and functions involved in this industry before making a career in it (e.g. as a software testing practitioner in healthcare). Listed below are the key business functions associated with this industry (Healthcare Providers). Each business function might have its own (testing) challenges to solve, but those will be discussed later and not in this post.


Patient Administration System
  • Patient Registration & Enquiry
    • Registration of new patients and updating of existing patient details
    • Management of patient details at a centralized location. The details include the patient's demographic information and medical records. Patient details can be searched by HL7 identifiers (visit number, account number, MRN, EMPI)
    • Advanced multi-criteria search for already registered patients
    • Recording insurance details
  • Appointment Scheduling
    • Booking and scheduling an appointment with a physician or lab personnel
    • Cancellation or rescheduling of appointments
    • Searching availability of doctors or consultants for appointments
  • Admission, Discharge, and Transfer
    • Searching for available beds or rooms and allocating them as per the associated cost
    • Final billing and settlement
  • Accident and Emergency Management
    • Storing all legal cases related to accident and emergency cases
    • Keeping track of all emergency services provided to patients
    • Storing complete patient information
    • Keeping all medical reports
    • Transfer details
  • Referral Patient Management
    • Referring patients to a different healthcare organization
    • Recording details of patients being referred

Clinic Management System
  • Duty Roster
    • Management of the duty shifts of employees
    • Appointment scheduling with respect to the shift timing of the consultant
    • Management of substitutes when an employee is absent
    • Scheduling overtime
  • Nursing and Wards Management
    • Tracking patient information when a patient is shifted from one ward to another
    • Bed cleaning requests after a patient is discharged
    • Allocation of nursing staff and management of their schedules for wards/beds
  • EMR and Clinical Data Repository
    • Central repository of all medical records and patient history
    • Ability to store and manage a huge volume of clinical records at a centralized location
    • Archiving of medical records, handling location tracking, history, and label printing
    • Documentation at the patient’s home by caregivers such as nurses, synced with the centralized database
    • Clinical coding of medical records

Clinic Management Support System
  • Operation Theater (OT)
    • Scheduling the surgery, managing the surgery team, recording the surgery details and maintaining checklists related to surgeries
    • Electronic consent for surgeries from the patient or relatives
    • Maintaining the preoperative and postoperative conditions of the patient
    • Inventory and stock management of the OT
  • Pathology Laboratory
    • Receiving online requests generated by doctors/staff
    • Sticker printing with bar codes to ensure sample identification
    • Inventory management
    • Generating and printing test reports and bill receipts
    • Searching for different test results of different patients
    • Real-time communication of test results with different clinics
    • Differentiation of normal and abnormal values in test results
    • Tracking each patient’s laboratory sample and the test results obtained for it
  • Radiology & Imaging
    • Capturing images from radiology machines, storing them in JPG or another standard format and integrating them with other clinical workflows
    • Sending images in an electronic format
  • Dietary Services
    • Assistance to the kitchen in providing meals to inpatients as per instructions
    • Maintenance of meal scheduling, customizing meals as per patient needs and recording of individual meal orders
    • Capturing the calorie count and nutritional value of all meals
  • Blood Bank
    • Generating reports by area, by stock per blood group, or by expiry
    • Area-wise database of donors and blood groups
    • Maintaining information on all donor types - voluntary, exchanged or directed
  • Housekeeping, Linen & Laundry
    • Managing and monitoring housekeeping activities in a hospital
    • Housekeeping activities like room preparation, sweeping/mopping floors, dusting furniture, cleaning walls, bathrooms, linens etc.
    • Scheduling the cleaning of various areas of a hospital
    • Scheduling the change of linen for inpatients
    • Maintaining a count of incoming and outgoing laundry items
  • Biomedical Waste Management
    • Scheduling collection and disposal of biomedical waste
    • Categorization of waste into different types and color codes
    • Radioactive waste disposal management
  • Transportation Management
    • Keeping track of ambulances available for service
    • Managing transport vehicle services and contracts from external vendors
    • Scheduling of vehicles/ambulances and drivers
    • Managing emergency facilities provided in the ambulance

Material Management System
  • Inventory, Supply, and Procurement
    • Tracking of each and every item consumed within the healthcare organization
    • Online purchase requests from various stores
    • Receiving quotations and issuing purchase orders to various vendors
    • Automatic alerts when the minimum stock level is reached
    • Alerts for items nearing expiry
    • Racking and shelving of store items
  • Pharmacy Management
    • Managing drug distribution, stock management, and monitoring functions
    • Proper storage of batch numbers, drug interactions, manufacturing dates and expiry dates
    • Providing an extensive list of available drugs
    • Alert generation when stocks reach the minimum level
  • Centralized Sterile Supply Department (CSSD)
    • Providing sterile items and equipment to the wards and operation theater
    • Recording the instruments etc. received daily from various departments for sterilization
    • Classification of instruments as per the type of sterilization technique
    • Locating various items in the CSSD
    • Tagging items with the sterilization date
    • Monitoring the quality of sterilization

Revenue Management System
  • Patient Billing
  • Insurance and Contracts Management
  • Claims Management
  • Healthcare Packages
  • Accounting & Finance

Administration Management System
  • HR / Payroll
  • Hospital / Clinic Administration
  • User and Security Administration - scanner, bar codes etc.
  • Management Information System / Reports Management

Tuesday, August 15, 2017

Key Agile Metrics for a Sprint

In any agile program, it is important to track both business and development progress metrics. Agile metrics help a team better understand its development process and make releasing quality software easier and faster. There is a big debate about the usage of some metrics, and there are concerns about using them in teams. Usually, the usage of metrics is guided by three rules -
1. When a measure becomes a target, it ceases to be a good measure (Goodhart's Law)
2. Measures tend to be corrupted/gamed when used for target setting (Campbell's Law)
3. Monitoring a metric may subtly influence people to maximize that measure (The Observer Effect)
Below are some popular key agile metrics at the sprint level, grouped by when they apply as the sprint progresses.

Pre Sprint Execution
  • Business Value – A value given to a user story by the product owner, representing its impact on stakeholders
During Sprint Execution
  • Work in progress (WIP) – Number of tasks that are in progress
  • Burndown / Burnup Chart – Shows the trend of remaining effort for the sprint. If new tasks are added in the middle of the sprint, a Burnup Chart should be used instead
  • Cycle Time – Total time taken by an individual issue to get from the “in progress” to the “done” state
  • Velocity – Rate at which the team is completing and delivering stories
  • % Automated Test Coverage – Percentage of the code base or requirements covered by automated tests
  • Test Pass / Fail Over Time – Shows the trend of testing progress in the sprint
  • Defects Trend (CFD) – Shows the trend of product quality during development in the sprint
Post Sprint Execution
  • Customer / User Satisfaction – Count of smiley face indicators after the demo
  • Team Happiness – Count of smiley face indicators after the retrospective meeting
  • Story Committed vs Completed (On-Time Delivery) – How well the team's commitment, based on its effort estimates, matches what it actually delivers

    Business Value
    The purpose of any software development effort is to create features that deliver business value. Two questions are associated with it: (1) How do we know if we are delivering value? (2) Are we delivering the right thing? The "Business Value" metric can measure the value delivered per sprint in terms of points or a dollar amount, but there is no way of tracking the real impact of the software until it is released. Some more key points related to this metric are as follows -
    • Product owner prioritizes higher value items towards the top of the backlog so that each sprint can deliver the maximum value possible
    • There is no standard formula for measuring value, but a clear view of what "value" means to the stakeholders needs to be articulated at the beginning
    • The product owner can use techniques such as "T-shirt sizing" to prioritize the project stories. An alternative approach is a three-dimensional metric that incorporates complexity, business value and ROI
    • For a project with a definite end, early sprints deliver very high value and later sprints gradually deliver less and less value
    • At some point, the cost of development eclipses the potential value of running another sprint, and that is a good time for the team to switch to a new product

    Work in progress (WIP)
    Multitasking becomes the norm when a team works on multiple items at once, and it leads to WIP. Multitasking sounds good, but it is deceptively time-consuming. When WIP is high, it is most likely that work is waiting and the team is switching between tasks. Hence, limiting the amount of WIP improves throughput and forces the team to complete its work. At a fundamental level, a WIP limit encourages a culture of "done".
    • A WIP limit determines the minimum and maximum amount of work that can stay in each status of the workflow
    • The goal of WIP limit is to ensure that everybody has work to do, but no one is multitasking
    • As a best practice, some teams set the maximum WIP limit below the number of team members. If someone finishes an item first, and the team is already at their WIP limit, he or she can join another developer/tester to knock out item(s) from their plate
    • Resist the temptation of raising WIP limit just because the team keeps breaching it. Understand the reasons behind it first and act accordingly
    • Consistent sizing of individual tasks helps in setting WIP limit correctly. It is important to keep individual tasks to no more than 16 hours of work
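
    A WIP limit check like the one described above can be sketched in a few lines of Python. This is only an illustration: the status names, the limits and the board data are made-up assumptions, not taken from any particular tool.

```python
from collections import Counter

# Hypothetical maximum WIP per workflow status (assumed values).
WIP_LIMITS = {"In Progress": 3, "In Review": 2}


def wip_breaches(items, limits=WIP_LIMITS):
    """Return {status: (count, limit)} for every status above its limit."""
    counts = Counter(item["status"] for item in items)
    return {
        status: (counts[status], limit)
        for status, limit in limits.items()
        if counts[status] > limit
    }


# Hypothetical sprint board.
board = [
    {"id": "S-1", "status": "In Progress"},
    {"id": "S-2", "status": "In Progress"},
    {"id": "S-3", "status": "In Progress"},
    {"id": "S-4", "status": "In Progress"},
    {"id": "S-5", "status": "In Review"},
]

print(wip_breaches(board))  # {'In Progress': (4, 3)}
```

    A breach reported this way is a prompt to look for the cause (blocked work, oversized tasks) rather than a reason to raise the limit.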

    Burn-down / Burn-up Chart
    Burn-down and Burn-up charts are both used to track and communicate the progress of a project. A Burn-down chart shows how much work remains to be done in the project, whereas a Burn-up chart shows how much work has been completed against the total amount of work. The Burn-down chart is simple but sometimes hides important information. The Burn-up chart avoids ambiguity and gives a complete picture of the project progress.
    • People sometimes confuse effort spent with effort remaining. If these are plotted wrongly, the insight from the report will be inaccurate
    • In a Burn-down chart, it sometimes appears that the team didn’t accomplish much in the middle of the project but heroically finished everything at the end. The same data plotted on a Burn-up chart might reveal that the team made steady progress all along, while scope increased in the middle and items were removed at the end to meet the deadline
    • The Burn-up chart is preferred over the Burn-down chart when project progress is presented on a regular basis to the same audience. Apart from showing steady progress, it shows changes in scope due to the addition of more work items or testing that reveals significant bugs. It might help in convincing customers to stop requesting changes
    • Both Burn-up and Burn-down charts help in showing the velocity of a team, which can be compared against the velocity required to meet the deadline
    • Burn-down charts are usually used at the Sprint level whereas Burn-up charts are mainly used at the release or project level
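
    The "heroic finish" effect mentioned above is easy to reproduce with numbers. The sketch below builds both series from the same data; the daily figures are invented for illustration, and the function name is an assumption rather than part of any charting tool.

```python
def burn_charts(daily_completed, daily_scope):
    """Build burn-up and burn-down series from per-day data.

    daily_completed[i] is the number of points finished on day i.
    daily_scope[i] is the total scope (in points) as of day i.
    """
    done_so_far = 0
    burn_up, burn_down = [], []
    for completed, scope in zip(daily_completed, daily_scope):
        done_so_far += completed
        burn_up.append((done_so_far, scope))   # completed work vs total scope
        burn_down.append(scope - done_so_far)  # remaining work only
    return burn_up, burn_down


# Hypothetical sprint: the team finishes 5 points every day,
# scope grows on day 3 and is cut on day 5.
completed = [5, 5, 5, 5, 5]
scope = [30, 30, 40, 40, 25]

burn_up, burn_down = burn_charts(completed, scope)
print(burn_down)  # [25, 20, 25, 20, 0]
print(burn_up)    # [(5, 30), (10, 30), (15, 40), (20, 40), (25, 25)]
```

    The burn-down series stalls in the middle and then drops to zero, while the burn-up series shows the completed line rising steadily and the scope line moving, which is the information the burn-down chart hides.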

    Cycle Time
    It is the measure of the time elapsed from when work starts on an item (story, task, bug etc.) until it is ready for delivery. In the DevOps era, it is measured until the task is deployed to production.
    • It tells how long it takes to complete a task. So, if an issue is reopened, worked on and completed again, this extra time is also added to the cycle time
    • Teams with consistent cycle times across many types of work (new features, bugs etc.) are more predictable in delivering work and can answer business owners with data-driven estimates
    • Cycle time is a direct measurement of productivity. With a short cycle time, new features reach end users more quickly
    • If cycle time stretches to a couple of days or more, the reasons could be
      • Story is too large
      • Task is not well understood by team
      • Definition of done has expanded
      • Work in progress (WIP) is high
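
    As a rough sketch, cycle time can be computed from an issue's status history. The example assumes a simple list of (timestamp, status) pairs and counts only the intervals between "in progress" and "done", so a reopened issue adds its second interval to the total. The data format and status names are assumptions for illustration.

```python
from datetime import datetime, timedelta


def cycle_time(transitions):
    """Sum the time an item spent between "in progress" and "done".

    transitions is a chronologically ordered list of (timestamp, status).
    Every in-progress -> done interval is counted, so reopened work adds up.
    """
    total = timedelta()
    started = None
    for when, status in transitions:
        if status == "in progress" and started is None:
            started = when
        elif status == "done" and started is not None:
            total += when - started
            started = None
    return total


# Hypothetical history of one issue that was reopened once.
history = [
    (datetime(2017, 8, 1, 9), "in progress"),
    (datetime(2017, 8, 2, 9), "done"),
    (datetime(2017, 8, 3, 9), "in progress"),  # reopened
    (datetime(2017, 8, 3, 21), "done"),
]

print(cycle_time(history))  # 1 day, 12:00:00
```

    Whether the idle time between "done" and the reopen should also count is a team decision; this sketch leaves it out.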

    Velocity
    It determines the regular run rate of the team, i.e. the rate at which the team delivers stories. It is one of the most hated metrics as it can be gamed easily. So, the temptation to compare velocity across teams must be resisted. Otherwise, instead of focusing on delivering working software that has business value for stakeholders, the team will be concerned only with delivering more story points. Instead, the trend should be analyzed in retrospectives to figure out the reason for a change in velocity.
    • Measuring velocity in stories done is better than measuring it in story points
    • If velocity is erratic over a long time, team estimation practices should be revisited
    • Velocity can increase in many ways, and not all of them are healthy
      • Team might start estimating higher effort
      • Stories will tend to become smaller (it is a good sign though)
      • Team can put less effort in refactoring or testing
    • There can be many other reasons for change in the Velocity other than team issues
      • Changes in the team size between sprints, new member joins or a veteran has left
      • Sprint is targeting short release cycle or maintenance work
      • Team doesn’t understand the scope of work at the start of the sprint
      • Team is working on something new (technology, domain etc.)
      • Team is working on a legacy code
      • Many holidays or sick leaves in the current sprint
      • Team had to deal with a few critical bugs
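
    One possible way to look at the velocity trend, rather than a single sprint's number, is a rolling average plus a measure of how erratic the series is. The sketch below uses the coefficient of variation for the latter; that choice, the window size and the sample data are assumptions, not a standard.

```python
from statistics import mean, pstdev


def velocity_summary(stories_done_per_sprint, window=3):
    """Return (rolling average, coefficient of variation) for velocity.

    The rolling average covers the last `window` sprints; the coefficient
    of variation (std deviation / mean) covers the whole history.
    """
    recent = stories_done_per_sprint[-window:]
    rolling_avg = mean(recent)
    overall_avg = mean(stories_done_per_sprint)
    variation = pstdev(stories_done_per_sprint) / overall_avg if overall_avg else 0.0
    return rolling_avg, variation


# Hypothetical velocity histories, in stories done per sprint.
steady_team = [8, 9, 8, 10, 9]
erratic_team = [3, 12, 5, 14, 4]

for name, history in (("steady", steady_team), ("erratic", erratic_team)):
    avg, variation = velocity_summary(history)
    print(f"{name}: rolling average {avg:.1f}, variation {variation:.2f}")
```

    A high variation over many sprints is a hint to revisit estimation practices; it says nothing about which team is "better".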

    % Automated Test Coverage
    Software projects become more complex over time as the lines of code grow (added features, defects etc.). This complexity tends to decrease the test coverage and quality of the product. The goal of automation is to reduce the time of testing and the cost of delivery while increasing test coverage and quality. Carefully defined metrics can provide insight into the status of the automated testing effort, and one such metric is percentage automated test coverage. It determines how much of the code base and functionality is covered by automated tests in a sprint. The automated tests include both unit tests and acceptance regression tests.
    • It is practically impossible to do full regression testing in a sprint without automated tests
    • All bug fixes should also be covered by automated tests as part of test coverage

    Test Pass / Fail Over Time
    As the application gets larger with each sprint, the total number of tests executed and passed should continue to increase. Failed tests should be flagged with a warning, especially if they relate to a priority defect. Any test that stays red for a long time warrants investigation and resolution. This helps to reach the agile objective of software that is releasable and of high quality at any given time.
    • The value of this metric should increase as the project progresses. If it doesn’t, it might be because the QA team is not able to close the defects
    • When the test pass rate decreases after a steady increase, it might be because the QA team has started re-opening defects

    Defects Trend (CFD)
    The Defects trend chart shows the cumulative defects opened versus cumulative defects closed in the sprint. It shows the submission rate of defects and the rate at which they are being closed, in the form of a cumulative flow diagram (CFD). If the distance between cumulative opened and cumulative closed is small, the team is resolving defects efficiently; otherwise it warrants investigation. Some of the questions that can be asked of the defects trend are -
    • Is the defect submission rate declining toward the end of the sprint as expected?
    • Are new defects being found at all?
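
    The two cumulative lines of the defects trend, and the gap between them, can be derived from daily counts. The following is a minimal sketch with invented numbers; the function name and data layout are assumptions.

```python
from itertools import accumulate


def defect_trend(opened_per_day, closed_per_day):
    """Return per-day (cumulative opened, cumulative closed, gap)."""
    cum_opened = accumulate(opened_per_day)
    cum_closed = accumulate(closed_per_day)
    return [(o, c, o - c) for o, c in zip(cum_opened, cum_closed)]


# Hypothetical five-day sprint.
opened = [4, 3, 2, 1, 0]
closed = [1, 2, 3, 2, 2]

for day, (o, c, gap) in enumerate(defect_trend(opened, closed), start=1):
    print(f"day {day}: opened={o} closed={c} still open={gap}")
```

    Here the gap goes 3, 4, 3, 2, 0: new defects taper off toward the end of the sprint and the closed line catches up with the opened line.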

    Customer / User Satisfaction
    Any product is made for its customers, and it is very important to focus on fulfilling their needs. Every sprint should aim to increase value for its customers. Customer/user satisfaction is one of the most important metrics to track. One way to measure it is to demo the product after every sprint and count the smiley faces; a more formal way is to share surveys with customers and stakeholders. Are customers willing to recommend the company’s product or services to others?

    If the count is low, does this mean the team is not doing a proper job? This is a hard question but there could be many other factors for this value to be low –
    • Product owner is not clear with the customer requirements
    • Customer is not involved in developing stories or defining story acceptance criteria
    • Sprint reviews are not being conducted properly
    • Customers and stakeholders are usually not present during sprint reviews

    Team Happiness
    This is a key metric for a successful scrum team. If teammates aren’t enthusiastic and don’t have a passion for their work, no process or methodology can help in further improvement. Usually, if a team is not happy, the side effects will start showing in future iterations in the form of more defects injected, lower velocity, more reopened defects etc. This metric is very hard to measure, as teammates are not always vocal about raising concerns or showing their unhappiness during the retrospective meeting. Some of the reasons for team unhappiness could be -
    • There is a high number of impediments during the sprint and they are not being removed in a timely manner
    • Team members can’t contribute in a product area because they lack knowledge or experience
    • Team members are working long hours sprint after sprint
    • There are internal conflicts among team members and they are not working collaboratively
    • Repeated mistakes are not being acknowledged or addressed
    • Team members are not being encouraged or valued and have lost passion for their work
    • Agile metrics being used to target individuals or teams

    Story Committed vs Completed (On-Time Delivery)
    This metric is a way to measure predictability. It can be measured by comparing the number of stories committed versus stories completed in the sprint. One thing that an agile team should definitely be able to do is to deliver software by a certain date. The score of this metric can be low due to several reasons other than the team effectiveness –
    • Team doesn’t have an applicable reference story to make relative estimates
    • Not every team member is experienced in the story’s referenced domain or technology
    • Customer requirements are not clearly communicated to the team
    • Requirements keep changing (scope creep) during the sprint
    • Many changes in the team (team disruption)
    • Changes need to be done in legacy code (new to many team members)
    • One team member has made the estimate alone, as per his / her own capability and thought process
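
    The committed versus completed comparison is simple arithmetic. A minimal sketch, with invented sprint data and an assumed function name:

```python
def on_time_delivery(committed, completed):
    """Return completed stories as a percentage of committed stories."""
    if committed <= 0:
        raise ValueError("committed must be a positive number of stories")
    return 100.0 * completed / committed


# Hypothetical sprint data: (stories committed, stories completed).
sprints = {"Sprint 1": (10, 8), "Sprint 2": (10, 10), "Sprint 3": (12, 9)}

for name, (committed, completed) in sprints.items():
    score = on_time_delivery(committed, completed)
    print(f"{name}: {score:.0f}% of committed stories completed")
```

    As with velocity, the number is more useful as a trend for one team than as a target.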

    Sunday, July 30, 2017

    Testing in Production (TiP) Approaches and Risks

    What & Why of TiP
    "Testing in production (TiP) is a set of software testing methodologies that utilizes real users and production environments in a way that both leverages the diversity of production, while mitigating risks to end users. By leveraging the diversity of production we are able to exercise code paths and use cases that we were unable to achieve in our test lab, or did not anticipate in our test planning"
    A to Z Testing in Production: TiP Methodologies, Techniques, and Examples

    "In today’s world, testing in production is not only important, but also critical to ensure complex applications work as we expect them to. Particularly with the move to devops and continuous delivery, testing in production has become a core part of testing software"
    An Introduction to Testing in Production

    TiP Approaches
    TiP is a growing practice in the industry, mainly due to the growing complexity of applications, the adoption of continuous delivery & deployment, and the need to reduce the cost of BUFT (Big Up Front Testing). Setting up a production-like environment for testing complex service(s) is another challenge. Some of the prominent TiP approaches are as follows -
        1. Production Data Analysis
        2. Exposure Control
        3. Synthetic Tests in Production
        4. Destructive Testing

    Production Data Analysis
    In this approach, existing production data is validated and analyzed against real user and system behaviour. The production environment is monitored closely for any alarms, and issues are fixed as per priority. There is no change to the production environment while using this approach, and all testing / operations are strictly read only. We need to be cautious about configuring too many monitors, though, as this might result in network congestion. Different types of production data and examples of what can be analyzed by testers are noted below.

    Transaction data
    »  Are users abandoning the site during the checkout process?
    »  Is a significant number of payment transactions failing?
    »  Are videos erroring out during playback?
    »  Is transaction data being retained as per the policy?
    »  Do you see stale resources in different regions? Is content being propagated across servers?
    »  Are there any broken links or unavailable resources like images / videos on the site?
    Logs (application, errors, events, audit, database, network, etc.)
    »  Has the number of errors increased after deployment of the new build?
    »  Are DoS attacks in progress?
    »  Are the results returned by the search engine meaningful?
    »  Are there lots of connection errors?
    »  Are there certificate-related errors?
    Monitoring logs
    »  Are transactions resulting in errors? From which region? At what time?
    »  Are users facing latency issues?
    »  Is the service stable or going down frequently?
    »  Is production up and running during deployment time?
    »  Has the volume of logging changed?
    Web analytics tools like Google Analytics
    »  Is the bounce rate high?
    »  Is a performance issue visible for some region, device or browser combination?
    »  Do you see a sudden increase in 404 or other errors?
    »  Which scenarios are taking the most time?
    Feedback loop
    »  Is there negative feedback online?

    Exposure Control
    This method can be used to roll out a software version slowly to a subset of users before rolling it out to the entire infrastructure and making it available to all. This approach makes sure that if there is a failure, it fails fast, and that improvements reach the market quickly through feedback from real users. Different ways in which exposure of code / features can be limited to a few users before releasing to all users are listed below.

    A/B testing (Experimentation for Design)
    »  It is also known as split testing or bucket testing.
    »  It is essentially an experiment where two or more variants of a page are shown to users at random in an unbiased way, and statistical analysis is used to determine which variation performs better for a given conversion goal.
    »  Testing one change at a time helps in pinpointing which changes had an effect on visitors’ behaviour. Over time, the combined effect of multiple winning changes from experiments can demonstrate measurable improvement
    »  Cut losses by detecting ineffective features early and increasing investment in successful features
    »  The new user experience is usually well-tested prior to the experiment
    »  While canary releases are a good way to detect problems and regressions, A/B testing is a way to test a hypothesis using variant implementations
    »  There is a possibility of data being compromised during experiment(s), so it is very important to keep redundant stores and consider read-only access during ramp-up of users. Rollback must be absolutely easy and reliable for the success of this approach.
    Canary Deployment
    »  It is about deploying new code to a small subset of the machines in production, verifying that the new bits didn’t cause regression (functionality and performance), and slowly increasing the exposure of the bits to the rest of the machines in production
    »  It starts with deploying the new version of the software to a subset of infrastructure to which no users are routed. On success, a few selected users are routed to it. Later, after gaining more confidence, it is released to more servers & users
    »  The migration phase lasts until all the users have been routed to the new version. In case of any issues, users are re-routed back to the old infrastructure. The old infrastructure is decommissioned on complete satisfaction
    »  Conducting capacity testing is easier using canary releases, along with a safe rollback strategy
    »  It can take days to gather enough data to demonstrate statistical significance from an A/B test, while a canary rollout can be carried out in minutes or hours.
    »  One drawback of using canary releases is that multiple versions of the software need to be managed
    »  Canary releases are also hard to use when the software is distributed and installed on users’ computers or mobile devices
    »  Managing database changes also requires attention when doing canary releases.
    »  With A/B testing we experiment with multiple experiences in order to make sure that we build the right thing, whereas canary releases check that the code is built right
    Feature Switches
    »  It is about incrementally building features in the production environment w/o making them available to real users
    »  Helps in gradually building larger features into the existing code without slowing the release cycles
    »  The usual battery of tests can be executed w/o disturbing real user behaviour
    Synthetic Tests in Production
    Synthetic tests are automated tests running against the production instances. Synthetic tests can be divided into two groups, API Tests and User Scenario Tests. The “Write once, test anywhere” principle is preferred, where the same tests can run against both test and production environments. Production monitors / diagnostics are enabled to assess and report pass / fail. Performance metrics are monitored very closely, and test automation is shut down in case of any kind of unacceptable impact on the user experience.
    End to End User Scenarios Execution
    »  Runs a subset of an application’s automated tests against the live production system on a regular basis.
    »  Results are pushed into a monitoring service, which triggers alerts in case of failures
    »  Be cautious about the following items, though -
      • Test users should not be visible to real users or be able to interact with real users
      • Tests should be hidden from search engines
      • Tests should not influence the continuity and stability of real users
      • Test data should be tagged so that it can be isolated and cleaned easily
      • Only a limited number of test users should be created
      • PII data should be avoided completely for any testing
      • After verification, test transaction data (e.g. test orders) should be cleaned
      • Real user data should not be modified, and synthetic data must be identified and handled in such a way as to avoid contaminating production data
    Load
    »  Simulating the production environment in a lab is usually very difficult, which is a primary reason for doing load testing in production even though it is inherently risky
    »  Synthetic load is injected onto production systems, usually on top of existing real-user load
    »  Real traffic data (user workflows, user behaviours and resources) should be collected for simulating load tests
    »  Service virtualization can be used for emulating responses from 3rd party services / back office
    »  Tests should be conducted when usage is low
    »  Generating load on an application involving a third party would indirectly generate load on the partner’s environment, which is NOT legal unless the said party has been informed in advance.
    »  Requires constant monitoring of the entire production environment, and tests should be stopped immediately to avoid any production issues
    »  Avoid steps that generate records in the back office (avoid validating the order)
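
    The advice about tagging and cleaning test data can be illustrated with a small sketch. An in-memory list stands in for the production order store, and the tag value and function names are hypothetical; a real system would tag records in whatever way its data model allows.

```python
# Hypothetical marker attached to everything a synthetic test creates.
TEST_TAG = "synthetic-test"


def place_test_order(orders, order_id):
    """Create a test order that carries the tag so it can be found later."""
    order = {"id": order_id, "tag": TEST_TAG}
    orders.append(order)
    return order


def clean_test_orders(orders):
    """Delete only tagged orders in place and return how many were removed."""
    before = len(orders)
    orders[:] = [o for o in orders if o.get("tag") != TEST_TAG]
    return before - len(orders)


# An in-memory list stands in for the production order store.
orders = [{"id": "real-1001"}, {"id": "real-1002"}]
place_test_order(orders, "test-1")
place_test_order(orders, "test-2")

removed = clean_test_orders(orders)
print(removed, [o["id"] for o in orders])  # 2 ['real-1001', 'real-1002']
```

    Because cleanup filters on the tag alone, real user data is never touched, which is the point of tagging in the first place.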

    Destructive Testing
    In this approach, the goal is to see how soon the system can recover when a failure happens. Infrastructure faults are deliberately injected into the production system (e.g. services, servers, network etc.) to validate service continuity in the event of real faults or to find the MTTR (mean time to recovery). According to Netflix engineers Cory Bennett and Ariel Tseitlin, “The best defense against major unexpected failures is to fail often. By frequently causing failures, we force our services to be built in a way that is more resilient.” Below are some tools used in the industry for destructive testing in the production environment.
    »  Netflix’s "Simian Army", a set of destructive scripts that the company deploys to simulate various failures
    »  "Latency Monkey" induces artificial delays
    »  "Conformity Monkey" finds instances that do not adhere to best practices and shuts them down
    »  "Janitor Monkey" searches for and disposes of unused resources
    »  "Chaos Monkey" introduces failures on purpose or randomly kills production instances in AWS
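
    MTTR itself is just an average over recorded failures. A minimal sketch, assuming each injected fault is logged as a (failed at, recovered at) pair; the timestamps below are invented.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average recovery time over (failed_at, recovered_at) pairs."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum(
        (recovered - failed for failed, recovered in incidents),
        timedelta(),
    )
    return total / len(incidents)


# Hypothetical log of two injected faults.
incidents = [
    (datetime(2017, 7, 1, 10, 0), datetime(2017, 7, 1, 10, 4)),
    (datetime(2017, 7, 8, 15, 30), datetime(2017, 7, 8, 15, 36)),
]

mttr = mean_time_to_recovery(incidents)
print(mttr)  # 0:05:00
```

    Tracking this value across repeated fault injections shows whether the system is actually getting faster at recovering.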

    Monday, July 17, 2017

    Where is Manual Testing in Continuous Deployment?

    The transformation from Waterfall to Agile methodology is almost over. Most companies are embracing Agile methodology for their release management. Some have perfected it and others are catching up. A few have moved forward and taken Agile to the next step by following the approach of Continuous Delivery / Continuous Deployment. All these changes suggest a clear pattern where end users’ expectations are valued and served quickly with quality delivery. The day might not be far off when users can track delivery of their favorite features or changes the way items are tracked via FedEx (or any other courier service) :). Does this suggest that there won’t be any place for manual testing in the future? In this blog, we will try to explore the answer.

    Before we delve into details, it is important to understand the difference between Continuous Delivery & Continuous Deployment. In Continuous Delivery, the decision to release to production is manual, and the release can be delayed depending on the risk or on who is taking the decision. In Continuous Deployment, the build is automatically deployed to production once the code is checked in. In Continuous Delivery, there is still time for manual or exploratory testing on staging environment(s) before taking the build to production.

    How about manual testing in “Continuous Deployment (CD)”? When is it done, or does it happen at all? Is it done by the developer alone, or with a tester tagging along before code check-in? Are the defects being found and raised directly by users? Will the CD approach not affect the brand name of the company offering the services? Testing of critical scenarios, scenarios having financial impact, usability & compliance, a few crazy scenarios etc. can’t be left to bots alone and needs to be accompanied by manual testing. I was pursuing these questions when another question popped up: are there companies releasing their code using the Continuous Deployment (NOT Continuous Delivery) approach? Below is a list of a few (not ALL) companies and their approach to conducting manual testing in CD.

    Continuous Deployment at IMVU: Doing the impossible fifty times a day
    In praise of continuous deployment: The story
    Intercom - Why continuous deployment just keeps on giving
    Etsy - Quantum of Deployment
    WealthFront - DevOps Cafe on Continuous Deployment and Operations Dashboards

    How & when is manual testing conducted?
    In the current world, end user applications are getting complex and dependent on many other services or applications. Testing all scenarios manually is not practical, but testing critical scenarios directly in production (TiP) seems feasible. Does this mean the quality of the service or application is being compromised? No. In reality, (in TiP) the new feature or changes need not be visible to end users as soon as the new build is released to production. There are ways to hide some features or changes from end users until they are validated by testers in production. Initially, the testers in the company validate the changes, and later the changes are propagated to Beta users. Upon their satisfaction, the feature or changes are released slowly to all users. In the meantime, if defects are found, the release can be rolled back without affecting ALL users.

    Does this mean that, even after releasing to production, it can take a long time before changes are propagated to users?
    In CD, regression testing gets done in almost all the phases. Developers do manual testing before checking in their code, followed by automatic static validation, unit testing and automated regression tests. The build is deployed and validated in multiple environments in an automated fashion prior to deployment in the production environment. In all the phases / environments, monitors are placed, which automatically test for reliability and other details. There are bots in the test system, which keep doing real transactions using real credit cards / debit cards etc. and make it difficult for defects to get through to production. Monitors keep watch for any regression and give alerts in advance. Missed scenarios are added to the automation test suite quickly for the next set of releases.

    As most of the testing is done in an automated way, only a few critical scenarios are left for manual testing, and even that depends on the criticality of the feature / changes.

    Why can’t all services be deployed using CD?
    In my opinion, CD is comparatively easy to apply to services where defects introduced will not create havoc, i.e. where there is no loss of life, hazard to health or direct monetary loss. I see that a lot of banks have also started applying CD, but not to the features / changes that might involve monetary loss. In all these cases, CD can be adopted, but releasing changes to the users can be delayed until there is confidence in the quality. Not every existing design architecture might support CD, but it can be tweaked over a period of time so that small changes or new features can coexist with other design elements and can be toggled on and off as per the requirements - like microservices.