Vol. 1, No. 3, Fall 2005

 

Enhanced performance and reliability

ILS has been growing at a nice, steady pace for the last four years. As some of you already know, we monitor the performance of each of our servers and processing workstations. If there is anything wrong, an ILS administrator is paged within 4 minutes and starts to take corrective measures.

In the spring of 2005, we saw that reacting to problems was not enough. We wanted to improve our system so that we could prevent some problems from happening in the first place. We identified the three most troublesome areas:

  • availability of Marketplace data at our web hosting company
  • availability of Marketplace Processing Center network connectivity
  • availability of Marketplace data at our game processing workstations

To increase the reliability of our service, we targeted these three areas and introduced several infrastructure updates during the summer of 2005.

Database data replication

We saw the loss of availability of Marketplace data at our web hosting company as a threat to providing reliable service to our customers. We knew that if our database server went down and we could not recover it within an hour, our service would suffer. To avoid this, we decided to introduce data replication at the database server level.

To achieve this, we purchased a second database server this past summer. We now have two database servers: a master and a slave. In this setup, the slave continuously replicates data from the master. In the unlikely event that the master database server goes down, we can swap the servers in less than 15 minutes and restore the Web Marketplace service.

Having all Marketplace data stored on two database servers increases our reliability and our ability to survive a serious problem. It also enables us to continue serving our customers with only a short interruption.
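For the technically inclined, the master/slave arrangement described above can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not our actual database setup: the class names, the server names "db-a" and "db-b", and the sample record are all hypothetical, and a real database replicates through its own built-in mechanism rather than application code.

```python
from dataclasses import dataclass, field


@dataclass
class DatabaseServer:
    """Hypothetical stand-in for one database server."""
    name: str
    up: bool = True
    rows: dict = field(default_factory=dict)


class ReplicatedPair:
    """Toy model of a master with one slave that mirrors every write."""

    def __init__(self, master, slave):
        self.master = master
        self.slave = slave

    def write(self, key, value):
        if not self.master.up:
            raise RuntimeError("master is down; fail over first")
        self.master.rows[key] = value
        # Replication step: the slave receives the same change.
        self.slave.rows[key] = value

    def fail_over(self):
        """Swap roles so the former slave starts serving as master."""
        if not self.slave.up:
            raise RuntimeError("no healthy slave to promote")
        self.master, self.slave = self.slave, self.master

    def read(self, key):
        return self.master.rows[key]


pair = ReplicatedPair(DatabaseServer("db-a"), DatabaseServer("db-b"))
pair.write("game-42", "quarter 3 decisions")
pair.master.up = False        # simulate the master going down
pair.fail_over()              # promote the slave
print(pair.read("game-42"))   # data written before the failure is still readable
```

Because every write reaches both servers before the failure, the promoted server can answer reads immediately after the swap.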

Internet connection in the Marketplace Processing Center

The next area that required our attention was the Marketplace Processing Center's Internet connectivity. We already had Internet connectivity via two independent providers. However, changes at our backup Internet service provider threatened to degrade its service, so we found another company (microcerv.net) to replace it.

At the same time, we upgraded the routers and switches that we use to maintain Internet connectivity to our two independent Internet providers. Our current setup enables us to switch from our primary Internet provider to our backup Internet provider in less than 60 seconds.
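The switch from primary to backup provider can be pictured with a short Python sketch. It is a simplified illustration only: in practice the routers make this decision, and the `choose_provider` function and the `link_up` states below are hypothetical names invented for the example. Only the two provider names come from this article.

```python
def choose_provider(providers, is_reachable):
    """Return the first provider, in priority order, whose link is reachable."""
    for provider in providers:
        if is_reachable(provider):
            return provider
    return None  # no provider is reachable


# Priority order: primary first, backup second.
PROVIDERS = ["ISDN.net", "microcerv.net"]

# Hypothetical link states; a real check would probe each uplink.
link_up = {"ISDN.net": False, "microcerv.net": True}

active = choose_provider(PROVIDERS, lambda name: link_up[name])
print(active)
```

With the primary link marked as down, the sketch selects the backup provider; once the primary is reachable again, the same priority order selects it first.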

Independently, our primary Internet provider (ISDN.net) upgraded their equipment and service. They have access to three Internet backbones here in the Digital Crossing building, where the Marketplace Processing Center resides.

After these upgrades, we have access to two Internet providers and four Internet backbones. Via our primary Internet provider, ISDN.net, we have access to the WV Fiber, BellSouth, and AT&T networks. Through our backup Internet provider, we have access to the MCI network.

These upgrades should significantly increase the reliability of our Internet connection in the Marketplace Processing Center.

Processing workstations' upgrade to RAID1

We have several unmanned game processing workstations whose sole purpose is to process games 24x7, 365 days a year. The loss of any of these workstations interrupts service: we then need to restore the data from backup on another workstation, which can take up to 12 hours. We identified hard disk failure as the most likely event that would make a game processing workstation inoperable.

To minimize the risk of a game processing workstation becoming inoperable due to hard disk failure, we upgraded all of our current game processing workstations to RAID1 and bought three new game processing workstations with a RAID1 setup. RAID1 enables a workstation to keep processing games even if one of its hard drives fails. These adjustments have made a hard drive failure a non-issue. The failed hard drive can be replaced later, when the game processing workstation is not engaged in data processing. The whole operation will take less than an hour.

For the technically inclined, RAID stands for redundant array of independent disks. We use the RAID1 setup, where the same data is mirrored on two drives, so the failure of one drive does not make the computer inoperable.
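The mirroring idea can be illustrated with a small Python model. It is a sketch under stated assumptions, not how our workstations implement RAID1: real RAID1 works at the disk level in a controller or operating system driver, and the `MirroredVolume` class, its methods, and the sample data are hypothetical names used only for this example.

```python
class MirroredVolume:
    """Toy RAID1: every block is written to both drives."""

    def __init__(self):
        self.drives = [{}, {}]        # two in-memory stand-ins for disks
        self.failed = [False, False]

    def write(self, block_no, data):
        written = False
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                drive[block_no] = data
                written = True
        if not written:
            raise IOError("both drives have failed")

    def read(self, block_no):
        # Serve the read from the first drive that is still working.
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                return drive[block_no]
        raise IOError("both drives have failed")

    def fail_drive(self, index):
        self.failed[index] = True
        self.drives[index] = {}

    def replace_drive(self, index):
        """Install a blank drive and rebuild it from the surviving mirror."""
        if self.failed[1 - index]:
            raise IOError("no surviving drive to rebuild from")
        self.drives[index] = dict(self.drives[1 - index])
        self.failed[index] = False


vol = MirroredVolume()
vol.write(0, b"game results")
vol.fail_drive(0)               # one drive dies
print(vol.read(0))              # the data is still readable from the mirror
vol.write(1, b"next game")      # processing continues on the surviving drive
vol.replace_drive(0)            # later: swap in a new drive and rebuild it
```

After the rebuild, both drives hold the same blocks again, including the data written while the volume was running on a single drive.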

Configuring all of our game processing workstations with RAID1 enables us to keep processing games even if one of a workstation's hard drives fails. Fast game processing matters to the many of you who run condensed games spread over only a few days. It matters to us too, because it shortens the time we need to spend fixing a hardware problem.

Conclusion

With these three infrastructure upgrades (Database data replication, Internet connection in the Marketplace Processing Center, Processing workstations' upgrade to RAID1), we have significantly improved the reliability of our service. This will enable us to spend more time developing Web Marketplace and less time fixing hardware problems. And that is exactly our goal.

Marketplace Community Newsletter, issued quarterly.

www.marketplace-simulation.com
Copyright © 2005 Innovative Learning Solutions. All rights reserved.
Innovative Learning Solutions, Inc., 500 West Summit Hill, Knoxville, Tennessee 37902, USA
Phone: 865.740.1776