@inproceedings{desell-massively-distributed-eas-2010,
author = "Travis Desell and David P. Anderson and Malik Magdon-Ismail and Heidi Newberg and Boleslaw Szymanski and Carlos A. Varela",
title = "An Analysis of Massively Distributed Evolutionary Algorithms",
booktitle = "Proceedings of the 2010 IEEE Congress on Evolutionary Computation (CEC 2010)",
address = "Barcelona, Spain",
year = "2010",
pages = "1--8",
month = "July",
keywords = "distributed computing, evolutionary algorithms, volunteer computing",
pdf = "http://wcl.cs.rpi.edu/papers/2010_cec.pdf",
abstract = {Computational science is placing new demands on optimization algorithms as the size of data sets and the computational complexity of scientific models continue to increase. As these complex models have many local minima, evolutionary algorithms (EAs) are very useful for quickly finding optimal solutions in these challenging search spaces. In addition to the complex search spaces involved, calculating the objective function can be extremely demanding computationally. Because of this, distributed computation is a necessity. In order to address these computational demands, top-end distributed computing systems are surpassing hundreds of thousands of computing hosts; and as in the case of Internet based volunteer computing systems, they can also be highly heterogeneous and faulty. This work examines asynchronous strategies for distributed EAs using simulated computing environments. Results show that asynchronous EAs can scale to hundreds of thousands of computing hosts while being highly resilient to heterogeneous and faulty computing environments, something not possible for traditional distributed EAs which require synchronization. While the simulation not only provides insight as to how asynchronous EAs perform on distributed computing environments with different latencies and heterogeneity, it also serves as a sanity check because live distributed systems require problems with high computation to communication ratios and traditional benchmark problems cannot be used for meaningful analysis due to their short computation times.}
}
@inproceedings{desell-validating-eas-2010,
author = "Travis Desell and Malik Magdon-Ismail and Boleslaw Szymanski and Carlos A. Varela and Heidi Newberg and David P. Anderson",
title = "Validating Evolutionary Algorithms on Volunteer Computing Grids",
booktitle = "Proceedings of the 10th International Conference on Distributed Applications and Interoperable Systems",
address = "Amsterdam, Netherlands",
year = "2010",
pages = "29--41",
month = "June",
keywords = "distributed computing, evolutionary algorithms, volunteer computing, validation, scientific computing",
pdf = "http://wcl.cs.rpi.edu/papers/2010_dais.pdf",
abstract = {Computational science is placing new demands on distributed computing systems as the rate of data acquisition is far outpacing the improvements in processor speed. Evolutionary algorithms provide efficient means of optimizing the increasingly complex models required by different scientific projects, which can have very complex search spaces with many local minima. This work describes different validation strategies used by MilkyWay@Home, a volunteer computing project created to address the extreme computational demands of 3-dimensionally modeling the Milky Way galaxy, which currently consists of over 27,000 highly heterogeneous and volatile computing hosts, which provide a combined computing power of over 1.55 petaflops. The validation strategies presented form a foundation for efficiently validating evolutionary algorithms on unreliable or even partially malicious computing systems, and have significantly reduced the time taken to obtain good fits of MilkyWay@Home’s astronomical models.}
}
@InProceedings{guevara-dsvisualization-clei-2010,
author = {Gustavo A. Guevara and Travis Desell and Jason LaPorte and Carlos A. Varela},
title = {Modular Visualization of Distributed Systems},
booktitle = {XXXVI Conferencia Latinoamericana de Inform\'{a}tica},
year = 2010,
address = {Asunci\'{o}n, Paraguay},
month = {October},
pdf = {http://wcl.cs.rpi.edu/papers/clei2010dsv.pdf},
keywords = {distributed computing, distributed systems visualization},
abstract = {Effective visualization is critical to developing, analyzing, and optimizing distributed systems. We have developed OverView, a tool for online/offline distributed systems visualization, that enables modular layout mechanisms, so that different distributed system high-level programming abstractions such as actors or processes can be visualized in intuitive ways. OverView uses by default a hierarchical concentric layout that distinguishes entities from containers allowing migration patterns triggered by adaptive middleware to be visualized. In this paper, we develop a force-directed layout strategy that connects entities according to their communication patterns in order to directly exhibit the application communication topologies. In force-directed visualization, entities’ locations are encoded with different colors to illustrate load balancing. We compare these layouts using quantitative metrics including communication to entity ratio, applied on common distributed application topologies. We conclude that modular visualization is necessary to effectively visualize distributed systems since no one layout is best for all applications.}
}
@INPROCEEDINGS{2010AAS...21541321V,
author = {{Vickers}, J. and {Newberg}, H. and {Cole}, N. and {Desell}, T. and {Szymanski}, B. and {Magdon-Ismail}, M. and {Varela}, C.},
title = "{The Effect of Orientation in Cross Sectional Studies}",
booktitle = {American Astronomical Society Meeting Abstracts \#215},
year = 2010,
series = {Bulletin of the American Astronomical Society},
volume = 42,
month = jan,
pages = {\#413.21},
url = {http://adsabs.harvard.edu/abs/2010AAS...21541321V},
note = {Provided by the SAO/NASA Astrophysics Data System},
keywords = {distributed computing, astroinformatics},
abstract = {Recent studies of the Milky Way halo have shown that there are many tidal debris streams and other substructures that can be detected from the spatial distributions of halo stars. We are attempting to describe the spatial structure through maximum likelihood fitting of a smoothly varying component plus a set of additional components that describe the tidal debris. The Sagittarius tidal debris stream in particular is modeled by a set of piecewise linear cylinders with a density that falls off as a Gaussian from the central axis of the cylinder. We show the highest likelihood fit to the density of SDSS F turnoff stars along the Sagittarius stream, and the results of a test of the sensitivity of the likelihood fits to the angle between the stream direction and the angle at which the data is sliced into pieces.}
}
@InProceedings{wangp-vmg-acs-2010,
author = {Ping Wang and Wei Huang and Carlos A. Varela},
title = {Impact of Virtual Machine Granularity on Cloud Computing Workloads Performance},
booktitle = {Workshop on Autonomic Computational Science (ACS'2010)},
year = 2010,
pages = "393--400",
address = {Brussels, Belgium},
month = {October},
pdf = "http://wcl.cs.rpi.edu/papers/wangp-vmg-acs-2010.pdf",
keywords = {distributed computing, cloud computing, granularity, malleability, performance, virtual machine},
abstract = {This paper studies the impact of VM granularity on workload performance in cloud computing environments. We use HPL as a representative tightly coupled computational workload and a web server providing content to customers as a representative loosely coupled network intensive workload. The performance evaluation demonstrates VM granularity has a significant impact on the performance of the computational workload. On an 8-CPU machine, the performance obtained from utilizing 8 VMs is more than 4 times higher than that given by 4 or 16 VMs for HPL of problem size 4096; whereas on two machines with a total of 12 CPUs, 24 VMs gives the best performance for HPL of problem sizes from 256 to 1024. Our results also indicate that the effect of VM granularity on the performance of the web system is not critical. The largest standard deviation of the transaction rates obtained from varying VM granularity is merely 2.89 with a mean value of 21.34. These observations suggest that VM malleability strategies, where VM granularity is changed dynamically, can be used to improve the performance of tightly coupled computational workloads, whereas VM consolidation for energy savings can be more effectively applied to loosely coupled network intensive workloads.}
}
@inproceedings{wang-agc2010,
title = "Actor Garbage Collection Using Vertex-Preserving Actor-to-Object Graph Transformations",
author = "Wei-Jen Wang and Carlos Varela and Fu-Hau Hsu and Cheng-Hsien Tang",
booktitle = "Advances in Grid and Pervasive Computing",
address = "Bologna",
series = "Lecture Notes in Computer Science",
volume = 6104,
publisher = "Springer Berlin / Heidelberg",
month = may,
year = 2010,
pages = {244--255},
pdf = "http://wcl.cs.rpi.edu/papers/GPC10-1569275509.pdf",
keywords = "concurrent programming, actor model, garbage collection, internet programming languages",
abstract = {Large-scale distributed computing applications require concurrent programming models that support modular and compositional software development. The actor model supports the development of independent software components with its asynchronous message-passing communication and state encapsulation properties. Automatic actor garbage collection is necessary for high-level actor-oriented programming, but identifying live actors is not as intuitive and easy as identifying live passive objects in a reference graph. However, a transformation method can turn an actor reference graph into a passive object reference graph, which enables the use of passive object garbage collection algorithms and simplifies the problem of actor garbage collection. In this paper, we formally define potential communication by introducing two binary relations - the may-talk-to and the may-transitively-talk-to relations, which are then used to define the set of live actors. We also devise two vertex-preserving transformation methods to transform an actor reference graph into a passive object reference graph. We provide correctness proofs for the proposed algorithms. The experimental results also show that the proposed algorithms are efficient.}
}