Securely Operating the Enterprise Cloud

In our last blog, we examined how we treat our customers’ data as if it were our own. While this is a critical component of being secure in the cloud, my experience with our global customers has taught me that cloud security requires a multi-layered approach. Industry leaders who depend on the cloud to run their business need to know that, beyond their data being secure and the cloud being physically secure, there are strict operational controls and well-thought-out processes and procedures for when security events inevitably occur.

To start, a reminder that we run a highly redundant and secure cloud infrastructure built on a multi-instance architecture in which every customer instance has its own application logic and database. We have redundant routers, switches, firewalls and server load-balancers for every customer instance. We strengthen this setup with intrusion detection systems (IDS) and distributed denial-of-service (DDoS) protection at each of our global locations to quickly detect, alert on and remediate suspicious events. We run a Linux kernel with an enhanced security module as our operating system on the servers that run application logic in Java virtual machines. We build our infrastructure the way an enterprise would build it: we don’t optimize for cost but rather focus on real availability, security and performance. All of this infrastructure is the baseline for operating a secure cloud.

Physical Security

Beyond secure infrastructure, we operate in facilities that are similar to those that our customers would choose for their own environments. This aspect of operational security starts with the physical security at each of our datacenter locations. Each location uses multiple physical security methods including purpose-built buildings, 24×7 security guards, man-traps to enter the facility, biometric scanners (palm and fingerprint) and so forth. It’s very difficult to get into any datacenter facility where we deploy our cloud, even when authorized to do so. Further, we only use our own full-time, screened personnel to work in our datacenter locations. We don’t use contractors or third-party smart-hands to install or perform break-fix operations. Our customers’ data is our data, and we only want our own personnel to have physical access.

Secure Controls and Logging

Protecting physical access to an apartment building is only part of overall building security. Another critical factor is controlling who has the keys to the building, who can disarm the theft alarm, who can access the visitor logs, and so on. On the Enterprise Cloud, we have strict controls on who can access the underlying network and server infrastructure. Access requires a secure virtual private network (VPN) connection using multi-factor authentication (MFA) and one-time passwords. Also, very few individuals in the company have read-write access to infrastructure devices. An engineer who wants to make a change on a device must first strictly follow our change management process and is then granted read-write access by our 24×7 Site Reliability Engineering (SRE) team for the duration of the change window. The SRE team grants the engineer clearance-to-proceed for the change, and when the change is complete, the engineer’s read-write access is revoked. ServiceNow technicians may need temporary access to an instance to troubleshoot an issue, much like lending an apartment key to family members when they come to visit. Accordingly, the Enterprise Cloud gives customers full control of all technician logins (including completely prohibiting technician access without customer approval).
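To make the change-window idea concrete, here is a minimal Python sketch of time-boxed read-write access. It is purely illustrative and is not our actual tooling: the `ChangeWindow` and `AccessBroker` names, their fields and the device name are all hypothetical, and the approval step is reduced to a single flag.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ChangeWindow:
    """Hypothetical record of an approved change (assumed fields)."""
    engineer: str
    device: str
    start: datetime
    end: datetime
    approved: bool = False  # stands in for change management sign-off


class AccessBroker:
    """Hypothetical broker that grants read-write access per change window."""

    def __init__(self):
        self._grants = {}  # (engineer, device) -> expiry time

    def grant(self, window, now):
        # Refuse unless the change was approved and the window is open.
        if not window.approved or not (window.start <= now < window.end):
            return False
        self._grants[(window.engineer, window.device)] = window.end
        return True

    def revoke(self, engineer, device):
        # Called when the change completes; safe to call more than once.
        self._grants.pop((engineer, device), None)

    def can_write(self, engineer, device, now):
        # Access lapses on its own at the end of the window.
        expiry = self._grants.get((engineer, device))
        return expiry is not None and now < expiry
```

In this sketch, access disappears either when `revoke` is called at the end of the change or automatically once the window's end time passes, so a forgotten revocation does not leave standing read-write access.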

Similar to how an apartment building with a doorman has visitor logs, a ServiceNow instance has full audit logs of all login access and all transactions on the instance (including any efforts to delete logs). The collection of accurate logs and audit information is required for secure operation and for any potential forensic analysis. We continuously audit the Enterprise Cloud and have customers who audit our security and operations on a regular basis.
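One common way to make an audit trail tamper-evident is to chain entries together with hashes, so that altering or removing an entry breaks the chain. The Python sketch below shows that general technique under stated assumptions. It does not describe how ServiceNow instances store their logs, and the `AuditLog` class and its methods are hypothetical names used only for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _digest(body):
    # Hash a canonical JSON form so the result is reproducible.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class AuditLog:
    """Hypothetical append-only log with hash-chained entries."""

    def __init__(self):
        self._entries = []

    def append(self, actor, action):
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS
        body = {"actor": actor, "action": action, "prev": prev_hash}
        self._entries.append({**body, "hash": _digest(body)})

    def request_delete(self, actor, index):
        # Deletion is refused, and the attempt itself is recorded.
        self.append(actor, f"attempted delete of entry {index}")
        return False

    def verify(self):
        # Recompute every hash and check each link to the previous entry.
        prev_hash = GENESIS
        for entry in self._entries:
            body = {k: entry[k] for k in ("actor", "action", "prev")}
            if entry["prev"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True
```

Here a delete request never removes anything; it simply adds another entry, which mirrors the idea that efforts to delete logs are themselves logged.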

Security Incident Handling Process

Personnel securing apartment buildings or other facilities train on how to respond to a security incident. If an incident occurs, they need to know who calls the police, who is in control, who communicates the issue and so on. This process is as important to security operations as fences, cameras and locked doors. It is no different on the Enterprise Cloud, where we have a defined security incident handling process staffed by a global response team. This team has defined roles and responsibilities, with workflows for detecting, triaging, investigating, communicating and resolving any incidents. We test this process regularly to ensure that it runs smoothly if an incident occurs. Our customers can use the ServiceNow Security Suite to help their security personnel build a similar security incident response process.
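For teams building their own process, the stages named above can be modeled as a simple workflow. The Python sketch below is one possible shape and is not our internal process or a ServiceNow API: the `Stage` and `Incident` names are hypothetical, and it assumes the stages run strictly in the order listed, which a real process may not.

```python
from enum import Enum


class Stage(Enum):
    # Stage names follow the workflow steps listed in the text.
    DETECTED = 1
    TRIAGED = 2
    INVESTIGATED = 3
    COMMUNICATED = 4
    RESOLVED = 5


class Incident:
    """Hypothetical incident record that moves through stages in order."""

    def __init__(self, summary):
        self.summary = summary
        self.stage = Stage.DETECTED
        self.history = [Stage.DETECTED]  # trail of every stage reached

    def advance(self):
        # Move to the next stage; a resolved incident cannot advance.
        if self.stage is Stage.RESOLVED:
            raise ValueError("incident is already resolved")
        self.stage = Stage(self.stage.value + 1)
        self.history.append(self.stage)
        return self.stage
```

Keeping the `history` list means a regular test of the process can confirm afterward that no stage was skipped.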

Operating the Enterprise Cloud securely takes multiple layers of effort. We employ rigorous physical security, enforce strict controls on who can log in and make changes to our infrastructure, give our customers full access control and keep a wealth of logs and audit trails. Our customers demand these multiple layers of security and operational controls combined with a well-defined security incident handling process. We believe that these are the necessary features and procedures for enterprises to operate securely in the cloud.

Stay tuned for our next post in the series coming soon!

Allan Leinwand
Allan Leinwand has built a reputation for managing the world’s most demanding clouds – in B2B and B2C. He is the chief technology officer at ServiceNow responsible for building and running the ServiceNow Enterprise Cloud – the second largest enterprise cloud computing environment on the planet. In this role, he is responsible for overseeing all technical aspects and guiding the long-term technology strategy for the company. Before joining ServiceNow, Leinwand was chief technology officer – Infrastructure at Zynga, Inc. where he was focused on building one of the largest consumer cloud computing environments used in the delivery of the company’s social games to more than 80 million players daily. He got his start as a cloud pioneer at Cisco before “cloud computing” was a term and the idea of accessing applications from anywhere was still very new. In addition to expertise in running large enterprise cloud computing environments, he also provides expertise in software engineering, quality engineering and product-market fit to companies including Spoke, Inc.; Bulletproof 360, Inc.; MapAnything, Inc.; Founders Circle Capital; and Kleiner Perkins Caufield & Byers. He is a Board member of Marin Software. Leinwand has served as an adjunct professor at the University of California, Berkeley where he taught computer networks, network management and network design. He holds a bachelor of science degree in computer science from the University of Colorado at Boulder.
