Date of Award

2020-01-01

Degree Name

Doctor of Philosophy

Department

Computer Science

Advisor(s)

Christopher Kiekintveld

Abstract

With increasing energy demand and an intermittent supply of renewable energy sources, our current energy grid needs to be transformed into a more robust, reliable energy trading architecture. The smart grid promises this architecture as the future of the present energy market, where traders will use digital technologies to automate the management of power delivery. It is expected to improve on the current energy grid in many respects, including sustainable, clean, renewable, reliable, and secure energy supply, customer participation in markets, distributed generation, and transparency in energy trading. Using autonomous trading agents, we can bridge several dynamic energy markets and ensure an efficient and robust trading environment for all the players in the smart grid. The Power Trading Agent Competition (Power TAC) simulation emphasizes the strategic problems that autonomous trading agents, i.e., brokers, will face in managing the economics of a smart grid.

In Power TAC, brokers trade in multiple parallel markets, namely the wholesale, tariff, and balancing markets, to supply energy from producers to consumers. To be successful, brokers must make reasonable predictions about future supply, demand, and prices in the wholesale and tariff markets, so that their trading decisions maintain a favorable energy imbalance in the balancing market.

Market clearing price prediction is an integral part of the broker's wholesale market strategy because it helps the broker make intelligent decisions about purchasing energy at low cost in a day-ahead wholesale market. Machine learning methods are commonly used to predict prices in the Power TAC wholesale Periodic Double Auction (PDA) market, where brokers can use the price predictor to design bidding strategies. PDAs, in which multiple discrete trading periods are specified for a single type of good, are commonly used in real-world energy markets to trade energy for specific time slots and to balance demand on the power grid. Strategically, bidding in a PDA is complicated because the bidder must predict and plan for future auctions that may influence the bidding strategy for the current auction.
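As a rough illustration of what a single trading period of a double auction involves, the following minimal Python sketch matches buy and sell orders and reports a clearing price and cleared quantity. It is not taken from the dissertation or from the Power TAC code base: the function name `clear_double_auction`, the `(limit price, quantity)` order format, and the midpoint pricing rule are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical order format: (limit price, quantity in MWh).
Order = Tuple[float, float]


def clear_double_auction(
    bids: List[Order], asks: List[Order]
) -> Tuple[Optional[float], float]:
    """Clear one trading period of a uniform-price double auction.

    Assumption: the clearing price is the midpoint between the last
    matched bid price and the last matched ask price.
    Returns (clearing price or None if nothing trades, cleared quantity).
    """
    bids = sorted(bids, key=lambda o: -o[0])  # highest buy price first
    asks = sorted(asks, key=lambda o: o[0])   # lowest sell price first

    i = j = 0
    bid_left = bids[0][1] if bids else 0.0
    ask_left = asks[0][1] if asks else 0.0
    cleared = 0.0
    last_bid = last_ask = None

    # Keep matching while the best remaining bid still meets the best ask.
    while i < len(bids) and j < len(asks) and bids[i][0] >= asks[j][0]:
        traded = min(bid_left, ask_left)
        cleared += traded
        last_bid, last_ask = bids[i][0], asks[j][0]
        bid_left -= traded
        ask_left -= traded
        if bid_left <= 1e-12:  # current bid fully filled, move to the next
            i += 1
            bid_left = bids[i][1] if i < len(bids) else 0.0
        if ask_left <= 1e-12:  # current ask fully filled, move to the next
            j += 1
            ask_left = asks[j][1] if j < len(asks) else 0.0

    if last_bid is None:
        return None, 0.0
    return (last_bid + last_ask) / 2.0, cleared
```

In a periodic setting, a broker faces a sequence of such clearings for the same delivery time slot, which is why a bid placed now has to account for the prices expected in the remaining auctions.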

In our work, we use the REPTree model to predict prices and present a general bidding strategy for PDAs. Our wholesale market strategy uses forecasted clearing prices and Monte Carlo Tree Search (MCTS) to plan a bidding approach across multiple time periods. Additionally, we present a fast heuristic policy that can be used either as a standalone method or as a seeding technique to initialize the search space of the MCTS bidding strategy. We evaluate our bidding strategies using a controlled PDA simulator based on the wholesale market implemented in the Power TAC competition. We demonstrate that our strategies outperform state-of-the-art champion bidding strategies designed for that competition.

In the retail market, a broker makes sequential decisions simultaneously with other brokers to buy and sell energy by publishing tariffs, where a tariff is a contract between a broker and a customer. To be as profitable as possible, a broker needs an effective retail strategy for selling energy. Our work includes developing an isolated miniature retail market simulator to control the dynamic and stochastic variables of the complex retail market so that we can understand the basic features and strategic dynamics among retail trading strategies. We apply deep reinforcement learning, specifically a Deep Q-Network (DQN), to learn the best response (BR) strategy to a specific strategy played in this simulator. Using DQN as a best response learning technique, we propose "Clustered Double Oracle Empirical Game-Theoretic Analysis" (CDO-EGTA), a novel method for minimizing regret (i.e., maximizing revenues) in retail trading. The CDO-EGTA method clusters the existing pool of strategies into groups, learns a new BR strategy for each group using the Double Oracle Empirical Game-Theoretic Analysis method, and outputs a class of BR strategies to play the game. Empirical results show that our method outperforms existing methods in a regret comparison.
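To make the regret criterion concrete, the sketch below computes the regret of one strategy in a small empirical game: the gap between the best expected payoff available against a given opponent mixture and the payoff of the strategy actually chosen. This is a generic textbook-style definition, not the dissertation's evaluation code; the function names, the payoff-matrix layout, and the example numbers are assumptions made for illustration.

```python
from typing import Sequence


def expected_payoff(payoffs: Sequence[float], opponent_mix: Sequence[float]) -> float:
    """Expected payoff of one strategy against a mixed opponent strategy."""
    return sum(p * q for p, q in zip(payoffs, opponent_mix))


def regret(
    payoff_matrix: Sequence[Sequence[float]],
    chosen: int,
    opponent_mix: Sequence[float],
) -> float:
    """Regret of playing row `chosen` instead of the best available row.

    payoff_matrix[i][j] is the (hypothetical) revenue of our strategy i
    when the opponent plays strategy j; opponent_mix[j] is the
    probability that the opponent plays j.
    """
    values = [expected_payoff(row, opponent_mix) for row in payoff_matrix]
    return max(values) - values[chosen]


# Illustrative numbers only: two of our strategies against two opponent strategies.
example_matrix = [[3.0, 1.0], [2.0, 4.0]]
example_mix = [0.5, 0.5]
print(regret(example_matrix, 0, example_mix))
print(regret(example_matrix, 1, example_mix))
```

A strategy with zero regret is a best response to the opponent mixture among the strategies in the pool, which is why adding newly learned best responses to the pool can only keep regret the same or lower it for the player using them.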

Language

en

Provenance

Received from ProQuest

File Size

116 pages

File Format

application/pdf

Rights Holder

Moinul Morshed Porag Chowdhury
