-
Wei-Jen Wang.
Distributed Garbage Collection for Large-Scale Mobile Actor Systems.
PhD thesis,
Rensselaer Polytechnic Institute,
2006.
Keyword(s): distributed computing,
grid computing,
internet programming languages,
software agents.
Abstract:
Distributed actor garbage collection (GC) is a notoriously hard problem due to the nature of distributed actor systems — no common clock, no coherent global information, asynchronous and unordered message passing, autonomous behavior of actors, and counter-intuitive actor marking to identify live actors. Most existing distributed actor GC algorithms rely on First-In-First-Out (FIFO) communication, which constrains the actor model and is impractical with actor mobility; others depend on stop-the-world synchronization, which is intrusive and impractical for users’ computations. Existing actor GC algorithms ignore actor mobility and resource access restrictions. Existing distributed passive object GC algorithms cannot be directly reused because of the different semantics of passive objects and actors. To overcome the problems that existing algorithms cannot solve, this thesis presents a practical actor GC mechanism for distributed mobile actor systems. Our approach starts formalizing garbage actors, and then we show two different but similar transformation methods that prove the equivalence of actor GC and passive object GC. Two actor marking algorithms are derived from the transformation methods — the back pointer algorithm and the N-color algorithm. The back pointer algorithm has linear time complexity of O(E + V ) and extra space complexity of O(E + V ), and the N-color algorithm has time complexity of O(E lg∗ M) and extra space complexity of O(M), given that E is the number of references, V , the number of actors, and M, the number of unblocked and root actors. The N-color algorithm only requires scanning the reference graph once while the back pointer algorithm requires scanning it twice.The thesis follows by describing our distributed mobile actor GC mechanism. It consists of 1) an asynchronous, non-FIFO reference listing based algorithm which supports hierarchical GC (local and global GC), 2) a new fault-tolerant, distributed snapshot-based algorithm which collects cyclic and acyclic garbage in a partial set of computing hosts, and 3) formal models and correctness proofs. Experimental results have confirmed that our approach is practical and scalable. |
@PhdThesis{wang-phd-2006,
author = {{Wei-Jen} Wang},
title = {Distributed Garbage Collection for Large-Scale Mobile Actor Systems},
school = {Rensselaer Polytechnic Institute},
year = 2006,
pdf = {http://wcl.cs.rpi.edu/theses/wangweijenphd.pdf},
keywords = {distributed computing, grid computing, internet programming languages, software agents},
abstract = {Distributed actor garbage collection (GC) is a notoriously hard problem due to the nature of distributed actor systems — no common clock, no coherent global information, asynchronous and unordered message passing, autonomous behavior of actors, and counter-intuitive actor marking to identify live actors. Most existing distributed actor GC algorithms rely on First-In-First-Out (FIFO) communication, which constrains the actor model and is impractical with actor mobility; others depend on stop-the-world synchronization, which is intrusive and impractical for users’ computations. Existing actor GC algorithms ignore actor mobility and resource access restrictions. Existing distributed passive object GC algorithms cannot be directly reused because of the different semantics of passive objects and actors. To overcome the problems that existing algorithms cannot solve, this thesis presents a practical actor GC mechanism for distributed mobile actor systems. Our approach starts formalizing garbage actors, and then we show two different but similar transformation methods that prove the equivalence of actor GC and passive object GC. Two actor marking algorithms are derived from the transformation methods — the back pointer algorithm and the N-color algorithm. The back pointer algorithm has linear time complexity of O(E + V ) and extra space complexity of O(E + V ), and the N-color algorithm has time complexity of O(E lg∗ M) and extra space complexity of O(M), given that E is the number of references, V , the number of actors, and M, the number of unblocked and root actors. The N-color algorithm only requires scanning the reference graph once while the back pointer algorithm requires scanning it twice.The thesis follows by describing our distributed mobile actor GC mechanism. It consists of 1) an asynchronous, non-FIFO reference listing based algorithm which supports hierarchical GC (local and global GC), 2) a new fault-tolerant, distributed snapshot-based algorithm which collects cyclic and acyclic garbage in a partial set of computing hosts, and 3) formal models and correctness proofs. Experimental results have confirmed that our approach is practical and scalable.}
}
-
Kaoutar El Maghraoui,
Travis Desell,
Boleslaw K. Szymanski,
and Carlos A. Varela.
The Internet Operating System: Middleware for Adaptive Distributed Computing.
International Journal of High Performance Computing Applications (IJHPCA), Special Issue on Scheduling Techniques for Large-Scale Distributed Platforms,
20(4):467-480,
2006.
Keyword(s): distributed computing,
grid computing,
middleware.
Abstract:
Large-scale, dynamic, and heterogeneous networks of computational resources (a.k.a. grids) promise to provide high performance and scalability to computationally intensive applications. To fulfill this promise, grid environments require complex resource management. We propose decentralized middleware-triggered dynamic reconfiguration strategies to enable application adaptation to the constantly changing resource availability of Internet-scale shared computational grids. As a proof of concept, we present a software framework for dynamically reconfigurable distributed applications. The Internet Operating System (IOS) is a middleware infrastructure which aims at freeing application developers from dealing with non-functional concerns while seeking to optimize application performance and global resource utilization. IOS consists of distributed middleware agents that are capable of interconnecting themselves in various virtual peer-to-peer topologies. IOS middleware agents: 1) profile application communication patterns; 2) evaluate the dynamics of the underlying physical resources; and 3) reconfigure application components by changing their mappings to physical resources through migration and by changing their granularity through a split and merge mechanism. A key characteristic of IOS is its decentralized coordination, thereby avoiding the use of global knowledge and thus enabling scalable reconfiguration. The IOS middleware is programming model-independent: we have implemented an actor programming model interface for SALSA programs and also a process programming model interface for MPI programs. Experimental results show that adaptive middleware can be an effective approach to reconfiguring distributed applications with various ratios of communication to computation in order to improve their performance, and more effectively utilize grid resources. |
@Article{elmaghraoui-ios-ijhpca-2006,
author = {Kaoutar El Maghraoui and Travis Desell and Boleslaw K. Szymanski and Carlos A. Varela},
editor = {Larry Carter and Henri Casanova and Fr\`{e}d\`{e}ric Desprez and Jeanne Ferrante and Yves Robert},
title = {The {Internet Operating System}: Middleware for Adaptive Distributed Computing},
journal = {International Journal of High Performance Computing Applications (IJHPCA), Special Issue on Scheduling Techniques for Large-Scale Distributed Platforms},
year = {2006},
publisher = {SAGE Publications},
volume = {20},
number = {4},
pages = {467-480},
pdf = {http://wcl.cs.rpi.edu/papers/ijhpca06.pdf},
keywords = {distributed computing, grid computing, middleware},
abstract = {Large-scale, dynamic, and heterogeneous networks of computational resources (a.k.a. grids) promise to provide high performance and scalability to computationally intensive applications. To fulfill this promise, grid environments require complex resource management. We propose decentralized middleware-triggered dynamic reconfiguration strategies to enable application adaptation to the constantly changing resource availability of Internet-scale shared computational grids. As a proof of concept, we present a software framework for dynamically reconfigurable distributed applications. The Internet Operating System (IOS) is a middleware infrastructure which aims at freeing application developers from dealing with non-functional concerns while seeking to optimize application performance and global resource utilization. IOS consists of distributed middleware agents that are capable of interconnecting themselves in various virtual peer-to-peer topologies. IOS middleware agents: 1) profile application communication patterns; 2) evaluate the dynamics of the underlying physical resources; and 3) reconfigure application components by changing their mappings to physical resources through migration and by changing their granularity through a split and merge mechanism. A key characteristic of IOS is its decentralized coordination, thereby avoiding the use of global knowledge and thus enabling scalable reconfiguration. The IOS middleware is programming model-independent: we have implemented an actor programming model interface for SALSA programs and also a process programming model interface for MPI programs. Experimental results show that adaptive middleware can be an effective approach to reconfiguring distributed applications with various ratios of communication to computation in order to improve their performance, and more effectively utilize grid resources.}
}
-
James D. Teresco,
Joseph E. Flaherty,
Scott B. Baden,
Jamal Faik,
Sbastien Lacour,
Manish Parashar,
Valerie E. Taylor,
and Carlos A. Varela.
Parallel Processing for Scientific Computing,
chapter Approaches to Architecture-Aware Parallel Scientific Computation,
pages 33-58.
SIAM,
December 2006.
Keyword(s): distributed computing,
grid computing,
middleware,
scientific computing.
Abstract:
Modern parallel scientific computation is being performed in a wide variety of computational environments that include clusters, large-scale supercomputers, grid environments, and metacomputing environments. This presents challenges for application and library developers, who must develop architecture-aware software if they wish to utilize several computing platforms efficiently. Architecture-aware computation can be beneficial in single-processor environments. It takes the form of something as common as an optimizing compiler, which will optimize software for a target computer. Application and library developers may adjust data structures or memory management techniques to improve cache utilization on a particular system. |
@InBook{teresco06b,
author = {James D. Teresco and Joseph E. Flaherty and Scott B. Baden and Jamal Faik and S\'ebastien Lacour and Manish Parashar and Valerie E. Taylor and Carlos A. Varela},
editor = {Michael A. Heroux, Padma Raghavan, and Horst D. Simon},
title = {Parallel Processing for Scientific Computing},
chapter = {Approaches to Architecture-Aware Parallel Scientific Computation},
publisher = {SIAM},
year = 2006,
pages = {33--58},
month = {December},
pdf = {http://wcl.cs.rpi.edu/papers/pp04.pdf},
keywords = {distributed computing, grid computing, middleware, scientific computing},
abstract = {Modern parallel scientific computation is being performed in a wide variety of computational environments that include clusters, large-scale supercomputers, grid environments, and metacomputing environments. This presents challenges for application and library developers, who must develop architecture-aware software if they wish to utilize several computing platforms efficiently. Architecture-aware computation can be beneficial in single-processor environments. It takes the form of something as common as an optimizing compiler, which will optimize software for a target computer. Application and library developers may adjust data structures or memory management techniques to improve cache utilization on a particular system.}
}
-
Travis Desell,
Kaoutar El Maghraoui,
and Carlos A. Varela.
Malleable Components for Scalable High Performance Computing.
In Proceedings of the HPDC'15 Workshop on HPC Grid programming Environments and Components (HPC-GECO/CompFrame),
Paris, France,
pages 37-44,
June 2006.
IEEE Computer Society.
Note: Best paper award.
Keyword(s): distributed computing,
grid computing,
middleware.
Abstract:
This paper describes a modular decentralized middleware framework for dynamic application reconfiguration in large scale heterogeneous environments. Component malleability is presented as a dynamic reconfiguration strategy. Dynamic component granularity enables improved load balancing by component migration, and most importantly, it enables applications to scale up to arbitrarily large numbers of processing nodes in a relatively transparent way. Preliminary experimental results show that without significant overhead, malleable components can improve applications performance and distributed resource utilization. |
@InProceedings{desell-ios-hpcgeco06,
author = "Travis Desell and Kaoutar El Maghraoui and Carlos A. Varela",
title = {{Malleable Components for Scalable High Performance Computing }},
booktitle = "Proceedings of the HPDC'15 Workshop on HPC Grid programming Environments and Components (HPC-GECO/CompFrame)",
year = 2006,
publisher = "IEEE Computer Society",
pages = {37--44},
address = {Paris, France},
month = {June},
note = {Best paper award},
pdf = "http://wcl.cs.rpi.edu/papers/hpcgeco06.pdf",
keywords = "distributed computing, grid computing, middleware",
abstract = {This paper describes a modular decentralized middleware framework for dynamic application reconfiguration in large scale heterogeneous environments. Component malleability is presented as a dynamic reconfiguration strategy. Dynamic component granularity enables improved load balancing by component migration, and most importantly, it enables applications to scale up to arbitrarily large numbers of processing nodes in a relatively transparent way. Preliminary experimental results show that without significant overhead, malleable components can improve applications performance and distributed resource utilization.}
}
-
Kaoutar El Maghraoui,
Alok Meghranjani,
Tamar Eilam,
Michael H. Kalantar,
and Alexander V. Konstantinou.
Model Driven Provisioning: Bridging the Gap Between Declarative Object Models and Procedural Provisioning Tools..
In Proc. of Middleware 2006, ACM/IFIP/USENIX 7th International Middleware Conference, Melbourne, Australia, November 27-December 1,
Lecture Notes in Computer Science,
pages 404-423,
2006.
Springer.
Keyword(s): distributed computing,
middleware.
Abstract:
Today’s enterprise data centers support thousands of mission-critical business applications composed of multiple distributed heterogeneous components. Application components exhibit complex dependencies on the configuration of multiple data center network, middleware, and related application resources. Applications are also associated with extended life-cycles, migrating from development to testing, staging and production environments, with frequent roll-backs. Maintaining end-to-end data center operational integrity and quality requires careful planning of (1) application deployment design, (2) resource selection, (3) provisioning operation selection, parameterization and ordering, and (4) provisioning operation execution. Current data center management products are focused on workflow-based automation of the deployment processes. Workflows are of limited value because they hard-code many aspects of the process, and are thus sensitive to topology changes. An emerging and promising class of model-based tools is providing new methods for designing detailed deployment topologies based on a set of requirements and constraints. In this paper we describe an approach to bridging the gap between generated “desired state” models and the elemental procedural provisioning operations supported by data center resources. In our approach, we represent the current and desired state of the data center using object models. We use AI planning to automatically generate workflows that bring the data center from its current state to the desired state. We discuss our optimizations to Partial Order Planning algorithms for the provisioning domain. We validated our approach by developing and integrating a prototype with a state of the art provisioning product. We also present initial results of a performance study. |
@InProceedings{elmaghraoui-planning-middleware-2006,
author = {Kaoutar El Maghraoui and Alok Meghranjani and Tamar Eilam and Michael H. Kalantar and Alexander V. Konstantinou},
title = {Model Driven Provisioning: Bridging the Gap Between Declarative Object Models and Procedural Provisioning Tools.},
booktitle = {Proc. of Middleware 2006, ACM/IFIP/USENIX 7th International Middleware Conference, Melbourne, Australia, November 27-December 1},
year = {2006},
pages = {404-423},
publisher = {Springer},
series = {Lecture Notes in Computer Science},
pdf = {http://wcl.cs.rpi.edu/papers/middleware06.pdf},
keywords = {distributed computing, middleware},
abstract = {Today’s enterprise data centers support thousands of mission-critical business applications composed of multiple distributed heterogeneous components. Application components exhibit complex dependencies on the configuration of multiple data center network, middleware, and related application resources. Applications are also associated with extended life-cycles, migrating from development to testing, staging and production environments, with frequent roll-backs. Maintaining end-to-end data center operational integrity and quality requires careful planning of (1) application deployment design, (2) resource selection, (3) provisioning operation selection, parameterization and ordering, and (4) provisioning operation execution. Current data center management products are focused on workflow-based automation of the deployment processes. Workflows are of limited value because they hard-code many aspects of the process, and are thus sensitive to topology changes. An emerging and promising class of model-based tools is providing new methods for designing detailed deployment topologies based on a set of requirements and constraints. In this paper we describe an approach to bridging the gap between generated “desired state” models and the elemental procedural provisioning operations supported by data center resources. In our approach, we represent the current and desired state of the data center using object models. We use AI planning to automatically generate workflows that bring the data center from its current state to the desired state. We discuss our optimizations to Partial Order Planning algorithms for the provisioning domain. We validated our approach by developing and integrating a prototype with a state of the art provisioning product. We also present initial results of a performance study.}
}
-
Wei-Jen Wang,
Kaoutar El Maghraoui,
John Cummings,
Jim Napolitano,
Boleslaw K. Szymanski,
and Carlos A. Varela.
A Middleware Framework for Maximum Likelihood Evaluation over Dynamic Grids.
In Second IEEE International Conference on e-Science and Grid Computing,
Amsterdam, Netherlands,
pages 8 pp,
December 2006.
Keyword(s): distributed computing,
grid computing,
middleware,
scientific computing.
Abstract:
We have designed a maximum likelihood fitter using the actor model to distribute the computation over a heterogeneous network. The prototype implementation uses the SALSA programming language and the Internet Operating System middleware. We have used our fitter to perform a partial wave analysis of particle physics data. Preliminary measurements have shown good performance and scalability. We expect our approach to be applicable to other scientific domains, such as biology and astronomy, where maximum likelihood evaluation is an important technique. We also expect our performance results to scale to Internet-wide runtime infrastructures, given the high adaptability of our software framework. |
@INPROCEEDINGS{wang-mle-06,
ADDRESS = {Amsterdam, Netherlands},
AUTHOR = {{Wei-Jen} Wang and Kaoutar El Maghraoui and John Cummings and Jim Napolitano and {Boleslaw K.} Szymanski and {Carlos A.} Varela},
BOOKTITLE = {Second IEEE International Conference on {e-Science} and Grid Computing},
MONTH = {December},
TITLE = {A Middleware Framework for Maximum Likelihood Evaluation over Dynamic Grids},
YEAR = {2006},
pages = {8 pp},
pdf = {http://wcl.cs.rpi.edu/papers/e-science2006.pdf},
keywords = {distributed computing, grid computing, middleware, scientific computing},
abstract = {We have designed a maximum likelihood fitter using the actor model to distribute the computation over a heterogeneous network. The prototype implementation uses the SALSA programming language and the Internet Operating System middleware. We have used our fitter to perform a partial wave analysis of particle physics data. Preliminary measurements have shown good performance and scalability. We expect our approach to be applicable to other scientific domains, such as biology and astronomy, where maximum likelihood evaluation is an important technique. We also expect our performance results to scale to Internet-wide runtime infrastructures, given the high adaptability of our software framework.}
}
-
Wei-Jen Wang and Carlos A. Varela.
Distributed Garbage Collection for Mobile Actor Systems: The Pseudo Root Approach.
In Proceedings of the First International Conference on Grid and Pervasive Computing (GPC 2006),
volume 3947 of Lecture Notes in Computer Science,
Taichung, Taiwan,
pages 360-372,
May 2006.
Springer.
Keyword(s): distributed computing,
grid computing,
internet programming languages.
Abstract:
Automatic distributed garbage collection (GC) gives abstraction to grid application development, promoting code quality and improving resource management. Unreachability of active objects or actors from the root set is not a sufficient condition to collect actor garbage, making passive object GC algorithms unsafe when directly used on actor systems. In practical actor languages, all actors have references to the root set since they can interact with users, e.g., through standard input or output streams. Based on this observation, we introduce pseudo roots: a dynamic set of actors that can be viewed as the root set. Pseudo roots use protected (undeletable) references to ensure that no actors are erroneously collected even with messages in transit. Following this idea, we introduce a new direction of actor GC, and demonstrate it by developing a distributed GC framework. The framework can thus be used for automatic life time management of mobile reactive processes with unordered asynchronous communication. |
@inproceedings{wang-varela-dgc-gpc-2006,
author = {{Wei-Jen} Wang and Carlos A. Varela},
title = {Distributed Garbage Collection for Mobile Actor Systems: The Pseudo Root Approach},
booktitle = "{Proceedings of the First International Conference on Grid and Pervasive Computing (GPC 2006)}",
year = {2006},
volume = {3947},
series = {Lecture Notes in Computer Science},
publisher = {Springer},
pages = {360-372},
address = {Taichung, Taiwan},
month = {May},
pdf = {http://wcl.cs.rpi.edu/papers/gpc2006.pdf},
keywords = {distributed computing, grid computing, internet programming languages},
abstract = {Automatic distributed garbage collection (GC) gives abstraction to grid application development, promoting code quality and improving resource management. Unreachability of active objects or actors from the root set is not a sufficient condition to collect actor garbage, making passive object GC algorithms unsafe when directly used on actor systems. In practical actor languages, all actors have references to the root set since they can interact with users, e.g., through standard input or output streams. Based on this observation, we introduce pseudo roots: a dynamic set of actors that can be viewed as the root set. Pseudo roots use protected (undeletable) references to ensure that no actors are erroneously collected even with messages in transit. Following this idea, we introduce a new direction of actor GC, and demonstrate it by developing a distributed GC framework. The framework can thus be used for automatic life time management of mobile reactive processes with unordered asynchronous communication.}
}
-
Wei-Jen Wang and Carlos A. Varela.
A Non-blocking Snapshot Algorithm for Distributed Garbage Collection of Mobile Active Objects.
Technical report 06-15,
Dept. of Computer Science, R.P.I.,
October 2006.
Keyword(s): distributed computing,
grid computing,
middleware.
Abstract:
Distributed actor garbage collection differs from distributed object garbage collection in that it needs to consider in-transit message detection, unordered message reception, and actor migration. In this paper, we propose a new snapshot-based distributed actor garbage collection algorithm. The algorithm does not require First-In-First-Out or blocking communication, nor message logging. Furthermore, actor migration is allowed while capturing global snapshots and partial snapshots can be safely used to collect garbage, therefore not requiring comprehensive cooperation among all computing nodes. These features make it unique in the area of distributed garbage collection. We formally prove the following safety and conditional liveness properties of the algorithm: 1) the garbage in a global snapshot, created by composing several local snapshots, remains the same from the beginning to the end of the global snapshot algorithm, and 2) garbage is eventually collected if the global garbage collection algorithm is periodically activated and every blocked actor is always captured before a global garbage collection phase is triggered. |
@techreport{wang-varela-snapshot-tr-2006,
author = {{Wei-Jen} Wang and Carlos A. Varela},
title = "A Non-blocking Snapshot Algorithm for Distributed Garbage Collection of Mobile Active Objects",
institution = "Dept. of Computer Science, R.P.I.",
number = "{06-15}",
month = oct,
year = 2006,
pdf = {http://www.cs.rpi.edu/research/pdf/06-15.pdf},
keywords = {distributed computing, grid computing, middleware},
abstract = {Distributed actor garbage collection differs from distributed object garbage collection in that it needs to consider in-transit message detection, unordered message reception, and actor migration. In this paper, we propose a new snapshot-based distributed actor garbage collection algorithm. The algorithm does not require First-In-First-Out or blocking communication, nor message logging. Furthermore, actor migration is allowed while capturing global snapshots and partial snapshots can be safely used to collect garbage, therefore not requiring comprehensive cooperation among all computing nodes. These features make it unique in the area of distributed garbage collection. We formally prove the following safety and conditional liveness properties of the algorithm: 1) the garbage in a global snapshot, created by composing several local snapshots, remains the same from the beginning to the end of the global snapshot algorithm, and 2) garbage is eventually collected if the global garbage collection algorithm is periodically activated and every blocked actor is always captured before a global garbage collection phase is triggered.}
}
-
Wei-Jen Wang and Carlos A. Varela.
Distributed Garbage Collection for Mobile Actor Systems: The Pseudo Root Approach.
Technical report 06-04,
Dept. of Computer Science, R.P.I.,
February 2006.
Note: Extended Version of GPC'06 Paper.
Keyword(s): distributed computing,
grid computing,
internet programming languages.
Abstract:
Automatic distributed garbage collection (GC) gives abstraction to grid application development, promoting code quality and improving resource management. Unreachability of active objects or actors from the root set is not a sufficient condition to collect actor garbage, making passive object GC algorithms unsafe when directly used on actor systems. In practical actor languages, all actors have references to the root set since they can interact with users, e.g., through standard input or output streams. Based on this observation, we introduce pseudo roots: a dynamic set of actors that can be viewed as the root set. Pseudo roots use protected (undeletable) references to ensure that no actors are erroneously collected even with messages in transit. Following this idea, we introduce a new direction of actor GC, and demonstrate it by developing a distributed GC framework. The framework can thus be used for automatic life time management of mobile reactive processes with unordered asynchronous communication. |
@techreport{wang-varela-dgc-tr-2006,
author = {{Wei-Jen} Wang and Carlos A. Varela},
title = "Distributed Garbage Collection for Mobile Actor Systems: The Pseudo Root Approach",
institution = "Dept. of Computer Science, R.P.I.",
number = "{06-04}",
month = feb,
year = 2006,
note = {Extended Version of GPC'06 Paper},
pdf = {http://wcl.cs.rpi.edu/papers/b2.pdf},
keywords = {distributed computing, grid computing, internet programming languages},
abstract = {Automatic distributed garbage collection (GC) gives abstraction to grid application development, promoting code quality and improving resource management. Unreachability of active objects or actors from the root set is not a sufficient condition to collect actor garbage, making passive object GC algorithms unsafe when directly used on actor systems. In practical actor languages, all actors have references to the root set since they can interact with users, e.g., through standard input or output streams. Based on this observation, we introduce pseudo roots: a dynamic set of actors that can be viewed as the root set. Pseudo roots use protected (undeletable) references to ensure that no actors are erroneously collected even with messages in transit. Following this idea, we introduce a new direction of actor GC, and demonstrate it by developing a distributed GC framework. The framework can thus be used for automatic life time management of mobile reactive processes with unordered asynchronous communication.}
}