Data Center Testing & Commissioning

End-to-end data center crash test!

Elite third-party, vendor-independent data center testing and commissioning units mobilize and camp on the data center site. They run rigorous test scripts with sophisticated equipment and tools to test and commission every component of the data center, and they know the exact approach needed to commission a site that is fault-tolerant, fail-safe and reliable for live operation.

Data Center Testing and Commissioning Services:

Experts review the design criteria of the data center and, based on them, create detailed test scripts to ensure that the final operation of the data center facilities meets the business objectives of the design. Full load is applied to the data center, and every aspect, from performance to redundancy, undergoes rigorous testing. Only when our engineers say you are ready for production can you sleep at night.

Individual System Functional Site Test:
Functional site test to 2N, N+1 one leg at a time, fault disturbance and monitoring for a period trending, switchboard site point-to-point, main, sub, busbars measurement, SI, protection relay, ACB direct acting, earth loop impedance, PDU testing and verification, load testing, noise test, UPS, earthing grids, signal reference grid, visual inspections, etc.
Individual System Site Acceptance Test:
Mains supply resilience, PQM, UPS on DC heater load, ATS operations, generator synchronization test, alarm activation, smoke detection simulation, gas activation in stages, room integrity testing and other such integral tests must all pass to complete a full site acceptance test.
Fully Integrated Level-5 Test & Commissioning:
Fully integrated functional tests are used to prove the functionality of the data center against its design parameters. The service stress tests the data center environment, and full functional test scripts are used to ensure that redundancy and capacity meet the design specifications and business needs.
Testing & Commissioning Report:
Fully detailed test scripts, commissioning guides, reports of all testing and commissioning conducted, O&M checks, performance benchmarking, competency reports, error reports, critical malfunction reports, design versus build misalignment reports and related documentation are necessary to record and conclude all testing activities effectively.
 

Data Center Testing and Commissioning Process:

The four steps below cannot comprehensively cover everything involved in commissioning a data center, but they highlight the key steps and provide examples from real projects for each. These steps are:

Design review.
Preparation for functional data center testing.
Implementation of the functional data center tests.
Review of trends and tests.
Obviously, these steps are part of an iterative process that must react to problems uncovered in the field. In our experience, no script can cover all of the contingencies, which include field installation, control sequences, equipment internal controls and configuration, unit delays and unanticipated issues uncovered in the commissioning process.

Design Review:
Whether acting as a third-party data center commissioning agent or as the engineers of record, we have found a peer review (either internal or external) by someone other than the actual designer to be an important part of the design process.

Having outside eyes take a fresh look at the design often uncovers design contingencies that the designer had not considered. In addition to the normal items under a peer review, a review of a data center must carefully consider failure modes, operation at part load and coordination of controls as discussed in the following paragraphs.
Failure Modes:
Both the designer and reviewer need to consider what will happen on failure of any piece of equipment or support system. Data center support systems are complex and interwoven: the mechanical, electrical and control systems must be reviewed as a whole, since a failure in any one may cascade to failures in the others.


For instance, loss of utility power in a data center will cause the emergency generator systems to come on-line.

The mechanical design must consider not only the continuation of cooling throughout the process but also the power-off and restart sequences of the mechanical equipment to prevent overloading the generator when it comes on-line.
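
The restart concern described above can be illustrated with a small staging calculation. This is only a minimal sketch under stated assumptions: the equipment names, starting demands in kW, the 300 kW step limit and the 30-second delay are hypothetical values chosen for illustration, not figures from a real project. The sketch groups mechanical loads into restart steps so that no single step adds more than an allowed block of load to the generator.

```python
from dataclasses import dataclass


@dataclass
class Load:
    name: str
    start_kw: float  # assumed demand while starting, in kW (hypothetical)


def plan_restart(loads, max_step_kw, step_delay_s):
    """Group loads into restart steps whose combined starting demand
    stays at or below max_step_kw. Returns a list of (time_s, [names])."""
    steps = []
    current, current_kw = [], 0.0
    for load in loads:  # loads are taken in the given priority order
        if load.start_kw > max_step_kw:
            raise ValueError(f"{load.name} exceeds the allowed step size")
        if current and current_kw + load.start_kw > max_step_kw:
            steps.append(current)  # close the step that is already full
            current, current_kw = [], 0.0
        current.append(load.name)
        current_kw += load.start_kw
    if current:
        steps.append(current)
    return [(i * step_delay_s, names) for i, names in enumerate(steps)]


# Hypothetical equipment list, in restart priority order.
loads = [
    Load("chilled water pump 1", 40),
    Load("CRAC 1", 30),
    Load("CRAC 2", 30),
    Load("chiller 1", 250),
    Load("cooling tower fan 1", 25),
]
plan = plan_restart(loads, max_step_kw=300, step_delay_s=30)
for t, names in plan:
    print(f"t+{t:>3d}s: {', '.join(names)}")
```

A real sequence would also need to account for motor inrush, generator step-load ratings and the control handshakes between the generator and mechanical control systems, all of which this sketch leaves out.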

Part-Load Operation:
With data centers, consideration of how the equipment unloads is important. Although the computer equipment within the center tends to run close to full load, most centers are phased in with racks being installed in groups over time. Furthermore, many of the assumptions for the expected load density change shortly after the data center becomes operational, or sometimes during construction. Because of this, most data centers are built either with future capacity already installed, or with provisions for future capacity to be added in later construction phases. During startup, systems usually are running at part-load.


The design of the equipment and systems supporting the data center must take into account part-load operation during the initial startup, and (as appropriate) uninterrupted operation as subsequent phases are built out. For initial startup of the data center and each subsequent phase of build-out, the cooling systems must be evaluated for their ability to stay on-line and for the provision of redundancy.

For pumps, fans and compressors (chillers and DX units) the review should ensure proper unloading controls are specified, and that the part-load operation is well away from the surge regions. To prevent temperature fluctuations and premature equipment failure, compression cooling should operate at the lowest anticipated load levels without excessive cycling. For cooling towers, the reviewer should ensure that the tower cells are designed for the highest and lowest anticipated flows with proper coverage of the fill. Variable speed fans or tower bypass should be considered to keep the condenser water temperature stable under low loads.
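
One way to check for excessive cycling during part-load tests is to count compressor starts in the control system trend data. The following is a minimal sketch, assuming the trend is available as a time-ordered list of `(minutes, running)` samples; the function name and the sample data are hypothetical, and the acceptable number of starts per hour would have to come from the equipment manufacturer.

```python
def starts_per_hour(samples):
    """samples: time-ordered list of (minutes_since_start, running) pairs.
    Returns the number of off-to-on transitions per hour of trend data."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    starts = sum(
        1
        for (_, prev), (_, cur) in zip(samples, samples[1:])
        if cur and not prev  # compressor went from off to on
    )
    hours = (samples[-1][0] - samples[0][0]) / 60
    if hours <= 0:
        raise ValueError("samples must span a positive time range")
    return starts / hours


# Hypothetical two-hour trend of a compressor run status.
trend = [
    (0, False), (10, True), (25, False), (40, True),
    (50, False), (60, True), (120, True),
]
rate = starts_per_hour(trend)
print(f"{rate:.1f} starts per hour")
```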

Controls:
Successful control of data center systems requires careful coordination of the control design for the mechanical and electrical equipment. A successful power-down restart sequence requires the generators to communicate to the mechanical control system the loss of power and subsequent readiness of the generators to accept load. The mechanical and electrical bid documents must be reviewed to ensure that all interconnections and associated sequences are fully specified, communication protocols are matched on each end, and that the scope of work for each contractor is clearly outlined. Even within a trade, care must be taken to coordinate the passage of information between equipment from various manufacturers.

Another issue to consider is provision of Uninterruptible Power Supply (UPS) power to the control system panels. This generally can be provided at a low incremental cost because the control panels are low wattage devices. Uninterrupted power to the control panels can greatly improve the stability of systems during the power-down restart sequences.

Preparation for Functional Data Center Testing:

Functional Data Center Test Scripts:
Functional data center test scripts like the example shown in Figure 2 should be developed for each major piece of equipment, all control reset sequences, all equipment staging, and all anticipated failure scenarios. Note these scripts supplement and don’t replace the system prefunctional tests (like hydronic pressure testing of the chilled water piping), contractor startup or control system data center commissioning (such as the point-to-point verification and testing of sequences).

Load Banks:

The data center generally will not be fully loaded during system startup and data center commissioning. In most cases, the computer systems are installed into the racks during the final phases of construction or just after data center commissioning, but provide little or no heat load for system testing. Load banks typically are used to introduce heat loads and to allow simultaneous data center testing of both electrical and cooling systems.

Renting and operating load banks is costly and introduces risk: if cooling fails, load banks can quickly overheat a space and potentially trigger sprinkler systems. This means that the time for using load banks and associated operators must be minimized. For large projects, a sufficient number of load banks must be reserved in advance to ensure availability.

The mechanical, electrical and control systems must all be ready to run when load banks arrive and the functional tests are to be run:

The support systems, generators, UPS and power distribution systems must be complete.
Control systems must be programmed and ready to trend equipment operation.
Chillers and associated hydronic systems, as well as computer room air conditioner (CRAC) units, package units, DX systems and any other installed equipment must have completed startup and be ready for operation. Chillers will not stay on-line without significant load, so they can be run only once load banks are started.

The following infrastructure support equipment should be tested:

Cooling systems, including chillers and all HVAC equipment
CRAC (Computer Room Air-Conditioning) units
All associated switchgear
Emergency diesel generator systems
ATS (Automatic Transfer Switch) equipment

UPS modules
Resistive load banks and associated cables / connectors
Static transfer switches
Rotary UPS system, if applicable
Battery banks, breakers and charging systems
Transformers (utility and site)
PDU (Power Distribution Unit) equipment
All distribution electrical panels

Loading Considerations:

All normal switchgear and electrical panels need to be checked under load
   
Test generator leads and emergency source for the automatic transfer switches under load
   
Resistive load banks must be attached to the PDUs and tested with increasing load percentages
   
Each UPS module must be tested independently, including a full load battery test
   
UPS battery connections and individual battery cells should be checked during and after discharge
   
Rotary UPS systems must be checked during operation. Rotary systems use the same rectifier technology as static topologies on the front end to create DC from AC, but use spinning motor-generators to recreate the sine wave on the output
   
Each PDU must be tested on both the preferred and alternate sources as well as in each respective bypass
   
All normal transfers should be verified operable
   
PDU distribution breakers must be checked after they are put into service on the panel boards
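
The stepped loading of a PDU mentioned in the list above can be sketched as a simple calculation of load bank setpoints. This is an illustration only: the 25/50/75/100 percent steps and the 150 kW rating are assumed values, and the function name is hypothetical. The actual steps should come from the project's test script.

```python
def load_steps(rated_kw, percentages=(25, 50, 75, 100)):
    """Return (percent, kW) pairs for a stepped load bank test.
    The default percentages are an assumption, not a standard."""
    if rated_kw <= 0:
        raise ValueError("rated_kw must be positive")
    if list(percentages) != sorted(percentages):
        raise ValueError("percentages must be in increasing order")
    return [(p, rated_kw * p / 100) for p in percentages]


# Hypothetical PDU rated at 150 kW.
steps = load_steps(150)
for percent, kw in steps:
    print(f"{percent:>3d}% -> set load bank to {kw:.1f} kW")
```
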
Causes for electrical failure and downtime in data center:

The critical power distribution system takes conditioned power from the UPS and distributes it throughout the facility to individual loads. Most site failures occur in areas where hot electrical work is required and physical maintenance is difficult to perform.

Typical causes for failures include:

Cover slipped while accessing load panel
Overheated breakers tripped unexpectedly
Wires were not physically secured under screws
Screws were not torqued adequately
Wires or circuit breaker handles were dislodged while adjacent work was being performed
Screws were stripped
Insulation was skinned, causing faulted wires
Rotations were reversed

Infrared Applications for Servers and Server Racks:

Ten percent of all server racks currently in service are too hot to meet industry standards for maximum IT reliability and performance.

“Institute research into computer room cooling indicates 1/3 all perforated tiles are incorrectly located and 60% of all available cooling capacity is being wasted by bypass airflow. Increasing under-floor static pressure to get air where it needs to go requires permanently blocking all unnecessary air escape routes. This includes sealing cable cutouts behind and underneath products or racks (this unmanaged airflow is what is really cooling most computer rooms) as well as the penetrations in the floor or walls or ceiling and any other openings in the raised floor. Perforated floor tiles with 25% openings can be replaced with 40% and 60% grates to permit a much higher airflow. For sites with unused raised floor space deliberately spreading equipment out to create white space and reduce the averaged gross watts per square foot power consumption will be a viable option.”
Server infrared applications include:

Thermally mapping complete data center from sub-floor to ceiling
   
Verifying proper hot aisle / cold aisle operation, preventing short circuiting and bypassing of air flow
   
Verifying high density server farm cooling capabilities
   
Monitoring server rack temperature distribution patterns
   
Finding internal server fans which are inoperable or damaged
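
As a rough illustration of monitoring rack temperature distribution, the sketch below flags racks whose hottest inlet reading exceeds a limit. The rack identifiers and readings are hypothetical, and the 27 °C default is an assumption used only as a placeholder; the limit should be replaced with whatever standard the site has adopted.

```python
def hot_racks(readings, limit_c=27.0):
    """readings maps a rack id to a list of inlet temperatures in °C,
    for example measured from the bottom to the top of the rack.
    Returns (rack, hottest_reading) pairs above limit_c, hottest first."""
    flagged = {
        rack: max(temps)
        for rack, temps in readings.items()
        if temps and max(temps) > limit_c
    }
    return sorted(flagged.items(), key=lambda item: item[1], reverse=True)


# Hypothetical survey data for three racks.
readings = {
    "A01": [20.5, 22.0, 24.5],
    "A02": [21.0, 25.5, 29.0],
    "B07": [23.0, 27.5, 31.5],
}
flagged = hot_racks(readings)
for rack, temp in flagged:
    print(f"{rack}: hottest inlet reading {temp:.1f} °C")
```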

Safety Considerations:
Of course, the thermographer must comply with all OSHA and NFPA 70E regulations. The good news is that unlike most industrial sites, the switchgear rooms and data centers have controlled temperatures and low humidity, which makes the use of the arc flash suits and associated safety equipment much less onerous for the thermographer.


How does a thermographer become “Qualified” and obtain contracts to do Data Center Thermal Survey Work?

First, the thermographer must understand the critical nature of the equipment being tested as well as the surrounding equipment. Furthermore, he/she should understand that the work he/she is performing is critical and vital to the operation. A thermographer wanting to do this type of work should get general training and certification on electrical switchgear and also get specific training on data center equipment. He/she should contact UPS vendors and their clients and cultivate relationships with them. Since this work carries high accountability, the methodology for performing the surveys and creating the reports must be “upgraded” from that used for the typical office building or factory.
This means the thermographer must use a high resolution, radiometric and sensitive thermal imager and learn how to record all thermal, visual and textual data by using a detailed data logging system. Also, data center specific work schedules often include nighttime maintenance windows from Saturday midnight until Sunday morning. Therefore, the thermographer must get used to working during off-peak times.

We know that large companies commission all data center equipment, but do the smaller companies have UPS and server systems? Absolutely! In order to successfully complete the data center commissioning process and maintain the systems, large and small companies must find thermographers that are close by and have experience in critical facility activities.

Coordination of the Trades:
Preparation for these tests is a multidisciplinary effort that requires input from both the design and construction teams.

 
The functional tests cover not only individual pieces of equipment but also the entire integrated mechanical, electrical and control systems. Designers and contractors of these systems should provide input to the script well in advance of the scheduled data center commissioning dates.

We have found that holding regular data center commissioning meetings for several months before actual data center commissioning serves to raise awareness of critical issues among all of the team members. Engineers, owners, contractors and equipment vendors can use these meetings to agree on sequence of events, coordination of schedules and responsibilities of key players.

Scheduling considerations should include these milestones:

A date for power to all the mechanical and control systems
   
A final date for precommissioning of all systems, which includes the typical startup for each piece of equipment and testing of all control system wiring, I/O point status, programming, configuration of alarms and configuration of trending. The trends and alarms must be active during the functional tests.
   
Arrival of load banks and duration of data center testing. Expected time to perform all testing, including:
  - Part-load and full-load (where possible) operation;
  - Sequenced failure of every piece of equipment, restart and return to stable operation;
  - Automatic transfer switch (ATS) and UPS operation;
  - Generator operation; and
  - Complete power-down and automatic restart.
   
Startup, prefunctional data center testing and functional data center testing of each system. The mechanical and controls contractors need to carefully coordinate the testing of their systems with the electrical contractors to ensure uninterrupted power is available during their tests. All three trades are typically testing their systems simultaneously.
   
Contingency time reserved for correction of errors and retesting

Execution of Functional Data Center Tests:
While the functional test scripts are being developed, a commissioning log should be created from each script so that the outcome of every test can be recorded. This log should record the events that occur during data center testing in parallel with automatic monitoring and trend logs from control systems. Recorded information should include: date and time of data center testing, the participants, and the expected and actual outcomes for each test.
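
The recorded information listed above maps naturally onto a small record structure. The following is a minimal sketch, not a prescribed log format: the class name, field names, test identifiers, dates and outcomes are all hypothetical and chosen only to illustrate the idea of pairing each scripted step with its expected and actual outcome.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class LogEntry:
    test_id: str                   # hypothetical identifier of a script step
    when: datetime                 # date and time of the test
    participants: List[str]
    expected: str                  # expected outcome taken from the script
    actual: str = ""               # filled in while witnessing the test
    passed: Optional[bool] = None  # None until the test has been run

    def record(self, actual: str, passed: bool) -> None:
        """Store the witnessed outcome of the test."""
        self.actual = actual
        self.passed = passed


def open_items(log):
    """Return the ids of tests that failed or have not been run yet."""
    return [entry.test_id for entry in log if entry.passed is not True]


# Illustrative entries only.
log = [
    LogEntry("GEN-01", datetime(2024, 1, 13, 23, 30),
             ["electrical contractor", "owner's representative"],
             "Generator accepts load after loss of utility power"),
    LogEntry("CH-02", datetime(2024, 1, 14, 0, 15),
             ["mechanical contractor", "controls contractor"],
             "Chiller restarts automatically after power returns"),
]
log[0].record("Generator accepted load", passed=True)
log[1].record("Chiller did not restart", passed=False)
print(open_items(log))
```

Keeping failed and unrun tests in one list of open items makes it easier to plan the retesting that the contingency time in the schedule is reserved for.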

During data center testing, unexpected results frequently occur due to system attributes that were overlooked during design or introduced during construction. It is important to have the right people on hand for every test so they can witness any failures and participate in the remediation and retesting. This includes the technicians from the associated trades, project managers, and the appropriate owner’s representatives. This deployment needs to be carefully coordinated and scheduled well in advance.

For larger equipment such as chillers or UPS systems, factory representatives should also be on site for the tests. You often discover during the tests that reprogramming or reconfiguration of a piece of equipment is required to pass the test. For example, in one project the control contractor had landed the remote chilled water reset and demand limit signals on the chiller’s control panel, but the chiller had not been programmed to look at the remote control inputs. Having the right people on site during the tests can significantly reduce the time for remediation and retesting.


Review of Trends and Tests:
After functional data center testing is complete, the control system trend logs and error logs should be thoroughly analyzed. Many interrelated actions are only clear through review of trends, since only a small fraction of system settings can be observed in real time during the functional test. The loss of a chiller, for example, may have been caused by the loss of its chilled or condenser water pump. Since all three appear to fail at once, the root cause may only become clear through review of the control system trends and the error logs from the chiller control panel and the variable speed drives.
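
The chiller example above can be sketched as a merge of fault events from several logs, ordered by timestamp. This is a minimal sketch under stated assumptions: the event sources, messages, timestamps and the 60-second window are hypothetical, and the clocks of the different panels are assumed to be synchronized. The earliest event is only a candidate root cause that still needs engineering review.

```python
from datetime import datetime


def first_fault(events, window_s=60):
    """events: iterable of (timestamp, source, message) fault records
    merged from several logs. Returns the earliest event and the events
    that followed it within window_s seconds."""
    ordered = sorted(events, key=lambda event: event[0])
    if not ordered:
        return None, []
    root = ordered[0]
    followers = [
        event for event in ordered[1:]
        if (event[0] - root[0]).total_seconds() <= window_s
    ]
    return root, followers


# Hypothetical fault records from the chiller panel and two drives.
events = [
    (datetime(2024, 1, 14, 2, 14, 21), "chiller 1 panel", "high condenser pressure trip"),
    (datetime(2024, 1, 14, 2, 14, 3), "condenser water pump VFD", "drive fault"),
    (datetime(2024, 1, 14, 2, 14, 25), "chilled water pump VFD", "stop command"),
]
root, followers = first_fault(events)
print("candidate root cause:", root[1], "-", root[2])
for _, source, message in followers:
    print("followed by:", source, "-", message)
```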