The Transactions of the Korean Institute of Electrical Engineers

ISO Journal Title: Trans. Korean. Inst. Elect. Eng.

Research article

]

The Transactions of the Korean Institute of Electrical Engineers

KIEE Vol. 69, No. 11, pp. 1616-1625

ISSN (print): 1975-8359

ISSN (online): 2287-4364

Received: 17 October 2020; Revised: 26 October 2020; Accepted: 27 October 2020

DOI: http://doi.org/10.5370/KIEE.2020.69.11.1616

Study on Missing PMU Data Recovery by Exploiting Low-Dimensional Hankel Structures: Experiments with KEPCO PMU Data Set

Korean title: A Study on PMU Data Recovery Using Low-Dimensional Hankel Structures

Jeonghoon Shin (신정훈) ¹, Suchul Nam (남수철) ¹, Evangelous Farantatos ², Meng Wang ³, Tae-Eung Sung (성태응) †

¹ Korea Electric Power Corporation Research Institute, South Korea.
² Electric Power Research Institute (EPRI), USA.
³ Dept. of Electrical, Computer & Systems Eng., RPI, USA.

† Corresponding Author: Div. of Computer and Telecommunications Eng., College of Science and Technology, Yonsei University (Wonju Campus), South Korea.

E-mail : tesung@yonsei.ac.kr

License: This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited (www.kiee.or.kr).

Abstract

Phasor Measurement Units (PMUs) provide synchronized phasor measurements at a much higher sampling rate than the traditional Supervisory Control And Data Acquisition (SCADA) system. Several synchrophasor-based algorithms and techniques have been and continue to be developed for real-time operation applications such as state estimation, stability analysis, disturbance detection, and dynamic security assessment. However, synchrophasor data quality limits the incorporation of synchrophasor-based applications into control room operations and processes. The goal of this project is to develop methods that improve synchrophasor data quality by recovering missing data reliably and efficiently. Data recovery refers to methods that estimate the values of missing data in the synchrophasor streams. Recently, modeless missing data recovery methods have been developed that exploit the low-rank property of spatial-temporal synchrophasor data blocks. A spatial-temporal synchrophasor data block can be considered as a matrix constructed from the measurements sampled at consecutive time instants, with each row denoting the measurements of one channel across time. By exploiting the low-rank property of synchrophasor data matrices, missing data recovery can be formulated as a low-rank matrix completion problem. The low-rank matrix completion problem has been extensively studied in the past few years, and several algorithms have been developed to recover a low-rank matrix from partial observations. In this study, synchrophasor data analysis is combined with low-rank matrix completion theory to develop a missing synchrophasor data recovery technique and tool.

Key words

Low-Dimensional Hankel Structure, Missing Data Recovery Method, Online Analytical Processing (OLAP), Phasor Measurement Units (PMU)

1. Introduction

This paper summarizes our recent progress on missing data recovery by exploiting low-dimensional structures in high-dimensional Phasor Measurement Unit (PMU) datasets. In a previous study ⁽¹⁾, a missing data recovery method referred to as OLAP (an online algorithm for PMU data processing) was developed to reconstruct missing PMU data, and it obtained promising results on actual PMU datasets. This study extends our previous efforts in this line of research. Specifically, data recovery under extreme conditions is considered, e.g., when almost all the measurements in all PMU channels are lost simultaneously. Existing low-rank-based methods cannot handle this pattern of simultaneous data loss. It is demonstrated that not only is the matrix of spatial-temporal PMU data low rank, but the Hankel matrix of the PMU data is low rank as well. This low-rank property of the Hankel matrix results from the power system dynamics and does not hold for general low-rank matrices. The central idea of this study is to further exploit the low-rank property of the Hankel matrix to recover missing points under extreme conditions and correspondingly modify and enhance the existing OLAP method, resulting in a new method named OLAP-H. OLAP-H has been tested numerically using a KEPCO PMU dataset. In order to simulate extreme conditions, four modes of missing data patterns are proposed, and the performance of OLAP-H is compared with that of OLAP. Numerical experiments demonstrate that the recovery accuracy is improved when OLAP-H is employed.

This paper is structured as follows. Section 2 describes the low-rank property of the spatial-temporal blocks of PMU data, which is the key to missing data recovery methods. Section 3 presents the research motivation. Section 4 describes the Hankel matrix and its low-rank property. Section 5 presents the developed method OLAP-H for online missing PMU data recovery. Section 6 reports the simulation results with different missing data modes. Conclusions are provided in Section 7.

2. Low-rank property of PMU data

The PMU data used in this paper were recorded by KEPCO at 345 kV substations (Fig. 1). The recording rate of the PMUs is 60 samples per second. From the 10 minutes of provided measurements, pre-processing yields the measurements over the common period, which starts at 2016-02-27 05:15:30:967 and ends at 05:25:29:983. Fig. 1 shows the geographical locations of the six substations at which the selected PMUs are installed in the KEPCO system, together with the recorded frequencies of all six substations. The data of SMITH lead the data of the remaining five substations by about 24 s, although they share the same time tags in the provided data files. The same misalignment can be seen in the voltage magnitudes in Fig. 2. Clearly, SMITH does not record the disturbance because of the time difference. Therefore, the data of substation SMITH are excluded from the remaining part of the paper, and the analysis uses the other five substations.

Fig. 1. PMU locations and record frequencies of six substations

The voltage angle data of the remaining five substations are shown in Fig. 3. Since the event occurs about 25 s after the start time of 05:15:30:967, this paper focuses on the first 40 s of data, which contain the event.

Fig. 2. The recorded voltage magnitudes

Fig. 3. The recorded voltage angles

Let a 5 by 2400 complex matrix $M$ contain the PMU data in the first 40 seconds. Each row corresponds to the sequence of voltage phasor measurements of one substation. Each column corresponds to the PMU measurements at the same sampling instant. Fig. 4 shows the singular values of $M$. The five singular values are 38241.1, 50.0, 31.7, 10.6, and 4.1.

Fig. 4. The distribution of singular values

If one wants to approximate matrix $M$ with a rank-$r$ matrix $\hat{M}$ with a minimum approximation error, the problem can be formulated as:

$$ \begin{array}{c} \min _{\hat{M}}\|M-\hat{M}\|_{F} \\ \text {s.t. } \operatorname{rank}(\hat{M})=r \end{array} $$

Let $M=U \Sigma V^{*}$ denote the singular value decomposition of matrix $M$, where $V^{*}$ denotes the conjugate transpose of complex matrix $V$. Let $U^{r}$ denote the $r$ dominant left singular vectors, let $\Sigma^{r}$ denote the diagonal matrix with only the $r$ dominant singular values on the diagonal, and let $V^{r}$ contain the $r$ dominant right singular vectors. Then, the solution to the above optimization problem is $\hat{M}=U^{r} \Sigma^{r}\left(V^{r}\right)^{*}$. The approximation error is defined as follows:

$$ e_{a}=\frac{\|M-\hat{M}\|_{F}^{2}}{\|M\|_{F}^{2}} $$

Based on this analysis, the KEPCO data matrix $M$ can be approximated by a rank-1 matrix with an approximation error of less than 0.01%. The low-dimensionality of PMU measurements can be employed to conduct PMU data compression ⁽²⁾, system event detection ⁽³,⁶⁾, missing data recovery ⁽¹,⁵⁾, cyber data attack detection ⁽⁴⁾, etc.

This study focuses on online missing data recovery. Examples of existing methods for online subspace tracking and missing data recovery include Parallel Estimation and Tracking by Recursive Least Squares (PETRELS) ⁽⁵⁾ and Grassmannian Rank-One Update Subspace Estimation (GROUSE) ⁽⁶⁾. These two methods assume that the dimension of the subspace is known and fixed. The dimension of the subspace, however, usually changes when a disturbance happens in power systems. In a past study, an online algorithm for PMU data processing (OLAP) ⁽¹⁾ was proposed to track the underlying subspace with varying dimensionality and fill in the missing observations.
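The rank-1 claim above can be checked directly from the five singular values reported in the text, because the squared Frobenius error of a truncated SVD equals the energy of the discarded singular values. The following is a minimal sketch; the function name `rank_r_error` and the random cross-check matrix are illustrative assumptions, not part of the original study.

```python
import numpy as np


def rank_r_error(singular_values, r):
    """Relative squared Frobenius error e_a of the best rank-r approximation.

    The truncated SVD is the optimal rank-r approximation, so the error is
    the energy in the discarded singular values over the total energy.
    """
    s = np.asarray(singular_values, dtype=float)
    return float(np.sum(s[r:] ** 2) / np.sum(s ** 2))


# Singular values of M as reported in the text.
sigma = [38241.1, 50.0, 31.7, 10.6, 4.1]
e1 = rank_r_error(sigma, 1)
print(f"rank-1 approximation error: {100 * e1:.5f} %")

# Cross-check on a random complex matrix (hypothetical data, not KEPCO data):
# the error computed from an explicit truncated SVD matches the formula.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 40)) + 1j * rng.standard_normal((5, 40))
U, s, Vh = np.linalg.svd(X, full_matrices=False)
X1 = (U[:, :1] * s[:1]) @ Vh[:1, :]
direct = np.linalg.norm(X - X1, "fro") ** 2 / np.linalg.norm(X, "fro") ** 2
print(abs(direct - rank_r_error(s, 1)))
```

With the reported singular values, the computed rank-1 error comes out well below the 0.01% bound quoted above.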

3. Limitations of existing low-rank methods in PMU data recovery

One major advantage of low-rank-based methods is that they are data oriented and do not require the modeling of power systems. One limitation of these methods is that they degrade significantly under extreme conditions. For example, if almost all the PMU measurements are lost simultaneously at a given time instant, existing methods either fail to recover the missing points or simply estimate them from previous measurements. Moreover, because the measurements are usually noisy, the recovery error generally increases as the noise increases. In addition, due to the lack of regularization terms and the influence of measurement noise, the recovered data can contain spikes, as shown in Fig. 5, where the data are erased randomly and the data loss percentage is 10%.

Building on the above analysis, we propose to exploit the Hankel structure to conduct online missing data recovery so that missing points can be recovered accurately under extreme conditions. The proposed low-rank-based method is also enhanced by adding a low-pass filter to reduce the impact of noise on the data recovery.

Fig. 5. Example of recovered data with the OLAP algorithm. The recovered data contain some spikes due to measurement noise.

4. Hankel matrix and the low-rank property

Let a vector $y_{t}$ in $C^{m \times 1}$ denote the measurements at instant $t$, where $m$ is the number of PMU measurement channels. The measurement matrix from time 1 to $t$ is represented by

$$ Y(1, t)=\left[\begin{array}{llll} y_{1} & y_{2} & \cdots & y_{t} \end{array}\right] $$

The rows and the columns of the measurement matrix are related through the circuit laws and the dynamics of power systems, as shown in Fig. 6.

Fig. 6. PMU data matrix

Hankel matrices are usually employed to analyze the order and parameters of linear systems based on the obtained inputs and outputs. A Hankel matrix is defined as

$$ Y_{H}^{j}(1, t)=\left[\begin{array}{cccc} y_{1} & y_{2} & \cdots & y_{t-j+1} \\ y_{2} & y_{3} & \cdots & y_{t-j+2} \\ \vdots & \vdots & \ddots & \vdots \\ y_{j} & y_{j+1} & \cdots & y_{t} \end{array}\right] $$

where $j$ denotes the number of observation vectors in each column of the Hankel matrix. The size of the constructed Hankel matrix is $jm \times (t-j+1)$. Using the KEPCO measurements during the first 40 seconds, the corresponding Hankel matrices were constructed and their approximation errors with low-rank matrices were computed, as shown in Fig. 7. It can be observed that the constructed Hankel matrices are still low-rank for different choices of $j$.

Note that when $j \ll t$, the size of the constructed Hankel matrix increases almost linearly with $j$. Fig. 8 shows the approximation errors if $Y_{H}^{j}(1, t)$ is approximated with a matrix of rank $jr$, where $r$ is the rank of the matrix used to approximate $Y(1, t)$. A decreasing trend in approximation error with increasing $j$ can be observed in Fig. 8. This means that the approximate rank does not increase proportionally with the size of the Hankel matrix of the PMU data, which in turn demonstrates that the Hankel structure captures additional dynamic information in the datasets compared with the original data matrix.
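The Hankel construction defined above can be sketched in a few lines of NumPy. The helper name `block_hankel` and the toy input are assumptions made for illustration; the function simply stacks $j$ shifted copies of the measurement matrix so that each column holds $j$ consecutive measurement vectors.

```python
import numpy as np


def block_hankel(Y, j):
    """Build the block Hankel matrix Y_H^j from an (m, t) measurement matrix.

    Column i of the result stacks y_i, y_{i+1}, ..., y_{i+j-1}, so the
    output has shape (j * m, t - j + 1).
    """
    m, t = Y.shape
    if not 1 <= j <= t:
        raise ValueError("j must satisfy 1 <= j <= t")
    cols = t - j + 1
    # Block row k holds the measurement matrix shifted by k time steps.
    return np.vstack([Y[:, k:k + cols] for k in range(j)])


# Toy example with m = 2 channels and t = 5 instants (hypothetical values).
Y = np.arange(1, 11).reshape(2, 5)
H = block_hankel(Y, 3)
print(H.shape)   # (j * m, t - j + 1)
print(H[:, 0])   # y_1, y_2, y_3 stacked

# Shape for a 5 x 2400 matrix such as the one described in Section 2.
big_shape = block_hankel(np.zeros((5, 2400)), 6).shape
print(big_shape)
```

With $j=1$ the function returns the measurement matrix itself, which matches the view of the original matrix as a special case of the Hankel matrix.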

Fig. 7. Low-rank property of the Hankel matrix

It should be noted that the low-rank property of the Hankel matrix results from the dynamics of power systems. This property does not hold for general low-rank matrices. To illustrate this, the columns of $Y(1, t)$ are randomly permuted, and $\tilde{Y}(1, t)$ denotes the matrix after permutation. One can easily check that $\tilde{Y}(1, t)$ has the same rank as $Y(1, t)$ and thus is still low-rank. However, since the columns in $\tilde{Y}(1, t)$ do not correspond to consecutive measurements, the corresponding Hankel matrix $\tilde{Y}_{H}^{j}(1, t)$ may no longer be low-rank, as shown in Fig. 9. From the results shown in Fig. 7 and Fig. 9, it can be observed that a rank-$k$ approximation to $Y_{H}^{j}(1, t)$ always achieves a smaller error than a rank-$k$ approximation to $\tilde{Y}_{H}^{j}(1, t)$. For instance, if $k=3$ and $j=5$, the approximation error to $\tilde{Y}_{H}^{j}(1, t)$ is 0.233, whereas the error to $Y_{H}^{j}(1, t)$ is smaller (Fig. 7).

Existing recovery methods are based on the low-rank property of the original observation matrix. The novel idea of this study is to further exploit the low-rank property of the Hankel matrix to improve the accuracy of data recovery. Numerical experiments using the KEPCO data show that the new method performs well under extreme conditions.
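The permutation argument above can be reproduced on synthetic data. The sketch below is an illustration under stated assumptions: the signal is a hypothetical sum of two complex exponentials (a simple stand-in for system dynamics, not the KEPCO data), and the helper names are invented here. It compares the rank-$k$ approximation error of the Hankel matrix before and after the columns of the measurement matrix are shuffled.

```python
import numpy as np


def block_hankel(Y, j):
    """Stack j consecutive columns of Y (m x t) into each Hankel column."""
    cols = Y.shape[1] - j + 1
    return np.vstack([Y[:, k:k + cols] for k in range(j)])


def rank_k_error(X, k):
    """Relative squared Frobenius error of the best rank-k approximation."""
    s = np.linalg.svd(X, compute_uv=False)
    return float(np.sum(s[k:] ** 2) / np.sum(s ** 2))


rng = np.random.default_rng(0)
m, t, j, k = 5, 200, 5, 2
n = np.arange(t)

# Two oscillatory modes observed through a random m x 2 mixing matrix,
# so the measurement matrix has rank 2 by construction.
modes = np.vstack([np.exp(0.3j * n), 0.999 ** n * np.exp(0.7j * n)])
A = rng.standard_normal((m, 2)) + 1j * rng.standard_normal((m, 2))
Y = A @ modes
Y_perm = Y[:, rng.permutation(t)]   # same columns, time order destroyed

rank_Y = np.linalg.matrix_rank(Y, tol=1e-8)
rank_Y_perm = np.linalg.matrix_rank(Y_perm, tol=1e-8)
err_orig = rank_k_error(block_hankel(Y, j), k)
err_perm = rank_k_error(block_hankel(Y_perm, j), k)

print("rank of Y and permuted Y:", rank_Y, rank_Y_perm)
print("rank-2 Hankel error, original order:", err_orig)
print("rank-2 Hankel error, permuted order:", err_perm)
```

In this toy setting the permutation leaves the rank of the measurement matrix unchanged, while the Hankel matrix built from the shuffled columns is no longer well approximated by a rank-2 matrix.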

Fig. 8. Approximation errors of Hankel matrices with a rank proportional to its size

Fig. 9. Low-rank matrix approximation of the Hankel matrix with columns randomly permuted in the measurement matrix

5. OLAP-H: an online missing data recovery method

The procedures of the original OLAP are summarized as follows:

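1) With the past $L$ output vectors at the current instant $t$, conduct SVD, determine the approximate rank $r$, and obtain the $r$ dominant left singular vectors $U^{r}$.

2) Receive new data $y_{t}$ with missing points and compute $\tilde{v}=\operatorname{argmin}_{v}\left\|P_{\Psi_{t}}\left(y_{t}-U^{r} v\right)\right\|_{F}^{2}$, where $\Psi_{t}$ denotes the support set of the observed entries in $y_{t}$ and $P_{\Psi_{t}}$ is the diagonal matrix that selects those entries.

3) Fill in the missing entries of $y_{t}$ with the entries of $U^{r} \tilde{v}$ indexed by $\Psi_{t}^{c}$, where $\Psi_{t}^{c}$ denotes the complement set of $\Psi_{t}$.

Next, OLAP-H is described, which conducts online missing data recovery based on the low-rank property of the Hankel matrix. The developed method has advantages over OLAP in the following aspects: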

a) It can fill in the missing entries with a higher accuracy under extreme conditions by exploiting the low-rank property of the Hankel matrix;

b) The algorithm reduces the influence of measurement noise on missing data recovery performance by (1) adding one regularization term and (2) assigning different weights to PMU channels with different noise levels when determining the coefficients of the method.

The procedures of the developed algorithm are as follows:

Input: Approximation error threshold $e_{a}$, coefficient $\lambda \in[0,1]$, number of observation vectors in each column of the Hankel matrix $j$, window length $L$.

For $t=1,2,3, \cdots$ do

1. Construct the Hankel matrix $Y_{H}^{j}(t-L, t-1)$ from the past $L$ observation vectors and conduct SVD on it,

$$ Y_{H}^{j}(t-L, t-1)=U \Sigma V^{*}, $$

where $x^{*}$ denotes the conjugate transpose of a complex vector or matrix $x$.

2. Find the smallest $r$ satisfying

$$ \frac{\left\|\Sigma-\Sigma^{r}\right\|_{F}^{2}}{\|\Sigma\|_{F}^{2}} \leq e_{a}, $$

where $\Sigma^{r}$ keeps only the $r$ dominant singular values of $\Sigma$; then determine the corresponding $r$ dominant left singular vectors $U^{r}$.

3. Receive new data $y_{t} \in C^{m \times 1}$ with erasures and construct a new column of the Hankel matrix, $\beta_{H}^{t}=\left[\begin{array}{llll}y_{t-j+1}^{*} & y_{t-j+2}^{*} & \cdots & y_{t}^{*}\end{array}\right]^{*}$. The support set of the observed entries in $\beta_{H}^{t}$ is denoted as $\Psi_{t}$.

4. Compute

$$ \tilde{v}=\operatorname{argmin}_{v}\left\|P_{\Psi_{t}} w\left(\beta_{H}^{t}-U^{r} v\right)\right\|_{F}^{2}+\lambda\left\|P_{\Psi_{t-1}} w\left(\beta_{H}^{t-1}-U^{r} v\right)\right\|_{F}^{2}, \quad (1) $$

where $w$ represents the assigned weight for each PMU ($w$ is smaller for a PMU with a higher noise level), and $P_{\Psi_{t}}$ is a diagonal matrix with $P_{\Psi_{t}}(i, i)=1$ if $i \in \Psi_{t}$ and $P_{\Psi_{t}}(i, i)=0$ otherwise.

5. Fill in the missing entries of $y_{t}$ with the corresponding entries in $U^{r} \tilde{v}$.

End for

Note that if the rank of underlying subspace is assumed to be fixed as r and known, then step 1 and step 2 can be simplified to conduct SVD (Singular Value Decomposition) on $Y_{H}^{j}(t-L, t-1)$ to obtain the $r$ dominant left singular vectors $U^{r}$.
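Step 2 of the procedure, finding the smallest $r$ whose discarded singular-value energy stays within $e_a$, can be sketched as follows. The function name `select_rank` is an assumption for illustration, and the input is assumed to be sorted in descending order, as returned by `numpy.linalg.svd`.

```python
import numpy as np


def select_rank(singular_values, e_a):
    """Smallest r with ||Sigma - Sigma^r||_F^2 / ||Sigma||_F^2 <= e_a.

    singular_values must be sorted in descending order.
    """
    s2 = np.asarray(singular_values, dtype=float) ** 2
    # tail[r] is the energy left after keeping the r largest singular values;
    # the reversed cumulative sum makes tail[len(s2)] exactly zero.
    tail = np.concatenate((np.cumsum(s2[::-1])[::-1], [0.0]))
    ratios = tail / tail[0]
    return int(np.nonzero(ratios <= e_a)[0][0])


# Singular values of the 5 x 2400 matrix M reported in Section 2.
sigma = [38241.1, 50.0, 31.7, 10.6, 4.1]
r_loose = select_rank(sigma, 1e-4)   # threshold of 0.01 %
r_tight = select_rank(sigma, 1e-6)   # a stricter, hypothetical threshold
print(r_loose, r_tight)
```

A tighter threshold leads to a larger selected rank, which is how the procedure lets the subspace dimension vary when a disturbance adds energy to the smaller singular values.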

In practice, $w$ can be estimated as

$$ w=\left[\begin{array}{cccc} \sigma_{1}^{-1} & 0 & \cdots & 0 \\ 0 & \sigma_{2}^{-1} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_{m}^{-1} \end{array}\right], $$

where $\sigma_{i}$ denotes the standard deviation of the recorded voltage magnitude of PMU $i$ when no event occurs. In this case, the measurements with higher levels of noise are assigned smaller weights. This allows a larger gap between the observed data and the recovered data, so as to reduce the impact of noise on determining the coefficient $\tilde{v}$ and the recovered data $U^{r} \tilde{v}$.
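The weight matrix above can be estimated from an event-free window of voltage magnitudes, as in the following minimal sketch. The function name `noise_weights`, the use of the sample standard deviation (`ddof=1`), and the synthetic three-channel data are assumptions made for illustration.

```python
import numpy as np


def noise_weights(v_quiet):
    """Diagonal weight matrix w = diag(1 / sigma_i).

    v_quiet is an (m, n) array of voltage magnitudes recorded while no event
    occurs; sigma_i is the standard deviation of channel i over that window.
    """
    sigma = np.std(v_quiet, axis=1, ddof=1)
    if np.any(sigma <= 0):
        raise ValueError("a channel with zero variance has no finite weight")
    return np.diag(1.0 / sigma)


# Hypothetical quiet-period data: 3 channels, 600 samples, different noise.
rng = np.random.default_rng(0)
true_sigma = np.array([0.1, 0.5, 2.0])
v_quiet = 345.0 + true_sigma[:, None] * rng.standard_normal((3, 600))
w = noise_weights(v_quiet)
print(np.diag(w))
```

The noisiest channel receives the smallest weight, so its residual contributes least when the coefficient vector is fitted.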

Due to the high sampling rate of PMUs, the measurement at time $t$ is not far from that at $t-1$. In addition, when the original OLAP algorithm is applied to recover the missing data, spikes may exist in the recovered data. In step 4, the first term $\left\|P_{\Psi_{t}} w\left(\beta_{H}^{t}-U^{r} v\right)\right\|_{F}$ and the second term $\left\|P_{\Psi_{t-1}} w\left(\beta_{H}^{t-1}-U^{r} v\right)\right\|_{F}$ evaluate the distance of the recovered data $U^{r} v$ to the observed data in $\beta_{H}^{t}$ and $\beta_{H}^{t-1}$, respectively. Taking both terms into account, the recovered data $U^{r} v$ are required not only to fit the observed data in $\beta_{H}^{t}$ well but also to stay close to $\beta_{H}^{t-1}$. In this case, the second term in (1) serves as a low-pass filter that smooths the recovered data.
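Steps 3 to 5 can be sketched as a stacked least-squares problem, since (1) is a sum of two weighted squared residuals in the same unknown $v$. This is an illustration under stated assumptions rather than the authors' implementation: the function names are invented, the $m$ per-channel weights are tiled across the $j$ blocks of a Hankel column, the rank is fixed to 2, and the data are a hypothetical noiseless sum of two complex exponentials. The example erases all $m$ channels of the newest vector at once.

```python
import numpy as np


def block_hankel(Y, j):
    """Stack j consecutive columns of Y (m x t) into each Hankel column."""
    cols = Y.shape[1] - j + 1
    return np.vstack([Y[:, k:k + cols] for k in range(j)])


def olap_h_step(U_r, beta_t, obs_t, beta_prev, obs_prev, w_diag, lam):
    """Solve Eq. (1) for v by stacked least squares, then fill beta_t.

    obs_t and obs_prev are boolean masks (True = observed entry).
    Tiling w_diag over the j blocks is an assumption of this sketch.
    """
    j = U_r.shape[0] // w_diag.size
    w_full = np.tile(w_diag, j)
    d_t = w_full * obs_t                     # diagonal of P_{Psi_t} w
    d_p = np.sqrt(lam) * w_full * obs_prev   # sqrt(lambda) P_{Psi_{t-1}} w
    A = np.vstack([d_t[:, None] * U_r, d_p[:, None] * U_r])
    b = np.concatenate([d_t * np.where(obs_t, beta_t, 0),
                        d_p * np.where(obs_prev, beta_prev, 0)])
    v, *_ = np.linalg.lstsq(A, b, rcond=None)
    filled = np.where(obs_t, beta_t, U_r @ v)   # keep observed entries
    return v, filled


# Synthetic rank-2 measurements (hypothetical, not the KEPCO data).
rng = np.random.default_rng(0)
m, j, L, T = 5, 3, 15, 60
n = np.arange(T)
modes = np.vstack([np.exp(0.3j * n), 0.999 ** n * np.exp(0.7j * n)])
Y = (rng.standard_normal((m, 2)) + 1j * rng.standard_normal((m, 2))) @ modes

# Steps 1-2 with the rank fixed to r = 2: basis from the past L vectors.
U, s, _ = np.linalg.svd(block_hankel(Y[:, T - 1 - L:T - 1], j),
                        full_matrices=False)
U_r = U[:, :2]

# Step 3: every channel of the newest vector y_t is lost simultaneously.
beta_t = Y[:, T - j:].T.ravel()
beta_prev = Y[:, T - j - 1:T - 1].T.ravel()
obs_t = np.ones(j * m, dtype=bool)
obs_t[-m:] = False
beta_obs = np.where(obs_t, beta_t, np.nan)
obs_prev = np.ones(j * m, dtype=bool)

# Steps 4-5 with unit weights and lambda = 0.
v, filled = olap_h_step(U_r, beta_obs, obs_t, beta_prev, obs_prev,
                        np.ones(m), lam=0.0)
max_err = np.max(np.abs(filled[-m:] - Y[:, -1]))
print("max recovery error for the fully lost vector:", max_err)
```

Because the observed part of the Hankel column still contains the two preceding measurement vectors, the coefficient vector remains identifiable even though nothing from the newest instant is observed. A positive `lam` would pull the fit toward the previous Hankel column, which trades some tracking accuracy for smoothness.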

6. Simulation Results

6.1 Recovery error

The relative error of one unobserved entry is defined as

$$ e_{t}(i)=\frac{\left|\hat{y}_{t}(i)-y_{t}(i)\right|}{\overline{|y(i)|}}, $$

where $\hat{y}_{t}(i)$ is the recovered data of PMU $i$ at time instant $t$, $y_{t}(i)$ is the actual data, and $\overline{|y(i)|}$ is the mean value of the voltage magnitude measured by PMU $i$, which can be determined from historical data.

The average recovery error is then defined as

$$ \bar{e}=\frac{1}{\left|\Psi_{T}\right|} \sum_{t \in \Psi_{T}}\left(\frac{1}{\left|\Psi_{t}^{c}\right|} \sum_{i \in \Psi_{t}^{c}} e_{t}(i)\right), $$

where $\Psi_{t}^{c}$ denotes the support set of unobserved entries in the measurement vector at time $t$, $\Psi_{T}$ denotes the set of time instants when the observation is incomplete, and $|\cdot|$ is the cardinality of a set. Since all the unobserved entries from time 1 to $T$ are considered, $\bar{e}$ is the average recovery error for each unobserved entry. A smaller $\bar{e}$ indicates a higher accuracy in recovering the missing data.

The maximum recovery error is defined as

$$ e_{m}=\max _{t \in \Psi_{T}}\left(\max _{i \in \Psi_{t}^{c}} e_{t}(i)\right). $$

$e_{m}$ is the maximum relative recovery error among all the missing entries from time 1 to $T$. $e_{m}$ indicates the recovery performance in the worst case.
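The two error measures defined in this subsection can be computed as in the sketch below. The function name `recovery_errors` and the small numerical example are assumptions made for illustration; the inputs are the true and recovered data, a boolean mask of observed entries, and the per-channel mean magnitudes.

```python
import numpy as np


def recovery_errors(Y_true, Y_rec, observed, mean_mag):
    """Return (average error e_bar, maximum error e_m).

    Y_true, Y_rec: (m, T) arrays; observed: (m, T) boolean mask with True
    for observed entries; mean_mag: (m,) mean voltage magnitude per PMU.
    """
    missing = ~observed
    rel = np.abs(Y_rec - Y_true) / mean_mag[:, None]
    cols = np.nonzero(missing.any(axis=0))[0]   # incomplete time instants
    if cols.size == 0:
        raise ValueError("no missing entries to evaluate")
    # Average within each incomplete column first, then across columns.
    per_col = [rel[missing[:, t], t].mean() for t in cols]
    return float(np.mean(per_col)), float(rel[missing].max())


# Small hypothetical example: 2 channels, 3 time instants.
Y_true = np.array([[100.0, 100.0, 100.0], [200.0, 200.0, 200.0]])
Y_rec = np.array([[100.0, 101.0, 100.0], [200.0, 204.0, 198.0]])
observed = np.array([[True, False, True], [True, False, False]])
mean_mag = np.array([100.0, 200.0])

e_bar, e_m = recovery_errors(Y_true, Y_rec, observed, mean_mag)
print(e_bar, e_m)
```

In this example the second instant has relative errors 0.01 and 0.02 and the third has 0.01, so the average over the two incomplete instants is 0.0125 and the maximum is 0.02.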

6.2 Simulated data loss

Since the motivation of this study is to address the limitations of existing algorithms, including their degraded performance in extreme conditions, four modes of data loss are proposed to simulate the extreme conditions.

Mode 1: Data losses happen at random locations.

Mode 2: Data losses happen at randomly selected time instants. At a time instant when data losses happen, only $r$ out of $m$ measurements are obtained, where $m$ is the total number of PMU channels. The locations of the obtained measurements are selected randomly.

Mode 3: Data losses happen at consecutive time instants. At a time instant when data losses happen, only $r$ out of $m$ measurements are obtained, and the locations of the obtained measurements are selected randomly.

Mode 4: Data losses happen at consecutive time instants. Only $r$ out of $m$ measurements are obtained at each time instant. Moreover, these obtained measurements are from the same set of PMUs.

A diagram of the four modes is shown in Fig. 10, where $r=2$ in modes 2, 3, and 4.

Fig. 10. Diagram of four modes of missing entries

One example of the chosen unobserved data from the KEPCO PMU data is demonstrated in Fig. 11, where the data loss percentage is 30%, and r=2.

Though it is difficult to distinguish the difference between mode 1 and mode 2, the number of unobserved entries in mode 1 is not fixed for a vector with missing entries, while only 2 entries are observed in mode 2 if there exist missing entries in one measurement vector.
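The four loss modes described above can be simulated with boolean masks, as in the following sketch. The function name `loss_mask` is invented here, and one detail is an assumption: the source does not state exactly how the data loss percentage maps to the number of affected time instants in modes 2 to 4, so this sketch interprets it as the fraction of all entries that are lost.

```python
import numpy as np


def loss_mask(mode, m, T, loss_frac, r=2, rng=None):
    """Return an (m, T) boolean mask, True where the entry is observed.

    Assumption: loss_frac is the fraction of all m * T entries lost.
    """
    if mode not in (1, 2, 3, 4) or not 0 < r < m:
        raise ValueError("mode must be 1-4 and 0 < r < m")
    rng = np.random.default_rng() if rng is None else rng
    observed = np.ones((m, T), dtype=bool)
    if mode == 1:
        # Individual entries lost at random locations.
        n_lost = int(round(loss_frac * m * T))
        observed.flat[rng.choice(m * T, size=n_lost, replace=False)] = False
        return observed
    # Modes 2-4: each affected instant keeps exactly r of the m channels.
    n_cols = min(T, int(round(loss_frac * m * T / (m - r))))
    if mode == 2:
        cols = rng.choice(T, size=n_cols, replace=False)
    else:
        start = rng.integers(0, T - n_cols + 1)
        cols = np.arange(start, start + n_cols)
    keep_fixed = rng.choice(m, size=r, replace=False)
    for t in cols:
        keep = keep_fixed if mode == 4 else rng.choice(m, size=r, replace=False)
        observed[:, t] = False
        observed[keep, t] = True
    return observed


rng = np.random.default_rng(0)
masks = {mode: loss_mask(mode, 5, 2400, 0.3, r=2, rng=rng)
         for mode in (1, 2, 3, 4)}
for mode, mask in masks.items():
    print(mode, 1.0 - mask.mean())   # realized fraction of lost entries
```

Mode 4 reuses one fixed set of $r$ surviving channels for the whole outage, whereas modes 2 and 3 redraw the surviving channels at every affected instant.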

Fig. 11. Example of missing data in four modes when 30% of the data points are lost

In the simulations, the parameter $j$ was first set to 6, which means each column of the Hankel matrix consists of 6 observation vectors. The window length was set to $L=15$. In this case, any Hankel matrix can be approximated by a rank-2 matrix with an error of less than 0.1%. Thus the dimensions of the underlying subspaces of the observation matrix $Y$ and the Hankel matrix were both fixed as 2. The mean value and the standard deviation of the voltage magnitude of each PMU were determined from the first 600 samples of voltage magnitudes, which correspond to a period with no events. $\lambda$ was varied from 0 to 0.05 and the data loss percentage from 10% to 50%. The columns with missing entries were chosen randomly for each test (for modes 3 and 4, the starting points of the data losses were selected randomly). The number of measurement vectors with missing entries is related to the data loss percentage. The average recovery error is averaged over 100 tests, and the maximum recovery error is the largest one among the maximum recovery errors of the 100 tests.

6.2.1 Influence of parameter $\lambda$

Fig. 12 shows the influence of parameter $\lambda$ on the average recovery error. One can see that a larger $\lambda$ results in larger average errors in all four modes. Part of the reason is that, in order to determine the coefficient $v$ in step 4 of OLAP-H, $\beta_{H}^{t-1}$ is taken into account with the aim of smoothing the recovered data. A larger $\lambda$ places more emphasis on keeping the recovered data close to $\beta_{H}^{t-1}$, so the recovered data might not be able to track a significant change in the measurements due to an event. In this case, we choose $\lambda$ close to 0.

Fig. 13 demonstrates the influence of $\lambda$ on the maximum recovery error. It can be seen that in mode 4, $\lambda=0$ still corresponds to the best recovery performance. In the remaining three modes, a small $\lambda$ can help to decrease the maximum recovery error, especially when the fraction of unobserved data is large.

To summarize, $\lambda$ is introduced to smooth the recovered data and reduce the spikes in it. In the first three modes, a small $\lambda$ can help to reduce the maximum recovery error at the expense of slightly increasing the average error. In mode 4, $\lambda=0$ corresponds to the best performance in terms of both the average and the maximum recovery errors. Thus $\lambda$ is suggested to be between 0 and 0.02 when the actual erasure pattern is a mixture of all four modes.

Fig. 12. The average recovery errors in four modes

Fig. 13. The maximum recovery errors in four modes

6.2.2 Influence of parameter w

As analyzed in Sec. 5, the role of parameter $w$ is to reduce the influence of measurement noise on the data recovery performance in order to mitigate the spikes in the recovered data. A noisier channel should correspond to a smaller coefficient. One example of this parameter's influence is shown in Fig. 14, where the observed data are the same as those shown in Fig. 5. The recovered data are obtained with OLAP-H, where $j=1$ and $\lambda=0$. Under this setting, OLAP-H differs from OLAP only by taking $w$ into account.

Fig. 14. The recovered data with w taken into account

Fig. 15. The recovered data from Hankel matrix

Compared to the recovered data in Fig. 5, there are fewer spikes in Fig. 14, and the magnitudes of those spikes are not as large as the ones shown in Fig. 5. The data recovered from OLAP-H with j=6 and is presented in Fig. 15, where no obvious spikes exist.

6.2.3 Influence of parameter j

In this subsection, $\lambda$ is fixed as 0.01 in the first three modes and as 0 in mode 4. Parameter $j$ varies from 2 to 6. Fig. 16 shows the average recovery errors with varying $j$, and Fig. 17 demonstrates the corresponding maximum recovery errors.

Fig. 16. The average recovery errors with varying j in modes 1 to 3

Fig. 17. The maximum recovery errors with varying j in modes 1 to 3

In modes 1 to 3, it is observed that as $j$ increases, which means there are more measurement vectors in each column of the constructed Hankel matrix, both the average recovery error and the maximum recovery error decrease. In this case, it is suggested to choose a large parameter $j$. Note that with a fixed window length $L$, i.e., a fixed number of historical measurement vectors, the size of the constructed Hankel matrix is $jm \times (L-j+1)$. Thus when $j \ll L$, the computational complexity of recovering the missing data increases as $j$ becomes larger. In this case, we suggest choosing $j$ between 3 and 6.

In mode 4, different from the results in modes 1 to 3, both the average recovery error and the maximum error increase as the parameter $j$ increases from 2 to 6. The result is shown in Fig. 18. In this case, in order to achieve the best performance, it is suggested to choose $j$ as 2 or 3.
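As a small worked illustration of how the Hankel matrix grows with $j$, the block-Hankel definition of Section 4 can be applied to a window of $L$ vectors. Assuming the setup of this section ($m=5$ channels and $L=15$), the dimensions $jm \times (L-j+1)$ and the entry counts would be:

$$ \begin{array}{c|c|c} j & jm \times(L-j+1) & \text{entries} \\ \hline 2 & 10 \times 14 & 140 \\ 3 & 15 \times 13 & 195 \\ 4 & 20 \times 12 & 240 \\ 5 & 25 \times 11 & 275 \\ 6 & 30 \times 10 & 300 \end{array} $$

Under these assumptions, the matrix passed to the SVD at every time step roughly doubles in entry count between $j=2$ and $j=6$.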

Fig. 18. The recovery errors with varying j in mode 4

In practical implementation, since it is difficult to determine which mode the missing data follow, and given the influence of parameters $j$ and $\lambda$ on the recovery performance in all the cases, $\lambda$ is suggested as 0.01 and $j$ as 3 or 4.

6.2.4 Comparison with the recovery from the original matrix

For the original measurement matrix, which can be regarded as a Hankel matrix with $j=1$, OLAP-H is applied to recover the missing data as well. The influence of the parameter $\lambda$ on the average and the maximum recovery errors is studied, and a proper $\lambda$ is chosen as around 0.05 for all four modes. The recovery performances on the original measurement matrix and on the Hankel matrix were then compared. Fig. 19 presents the average recovery errors in all four modes with $j=3$ in the Hankel matrix. Fig. 20 shows the maximum recovery errors in all the modes. The recovery performance obtained by duplicating the last observation vector is included as the reference in modes 1 to 3. In mode 4, the recovery error obtained by duplicating the last observation vector is much larger than the recovery errors of OLAP and OLAP-H, so it is not included.

Fig. 19. The average recovery errors in the four modes with j=3

Fig. 20. The maximum recovery errors in the four modes with j=3

In all the modes the recovery from Hankel matrix always achieves the highest accuracy. In addition, the maximum errors from the measurement matrix in mode 1 and 2 are very large, especially when the loss percentage is over 30%. Part of the reason is that in mode 1, the data losses happen at random locations. Thus with a higher loss percentage, it is very likely that at most 1 out of 5 entries in some measurement vector is observed at some time instant. In this case, OLAP fails to recover the missing data, since the coefficient vector v cannot be determined properly in step (4). Data recovery under this extreme data loss pattern is, in fact, the motivation of designing OLAP-H.
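The failure case described above can be illustrated with a rank check. The sketch below is illustrative only: it uses random orthonormal bases as hypothetical stand-ins for the estimated subspaces, with $m=5$, a subspace dimension of 2, and $j=3$. When a single entry of the newest vector is observed, the rows of the basis available to the least-squares fit have rank 1, so two coefficients cannot be determined, while a Hankel column still contributes the rows of the preceding vectors.

```python
import numpy as np

rng = np.random.default_rng(1)
m, r, j = 5, 2, 3

# Hypothetical orthonormal bases standing in for the estimated subspaces.
U = np.linalg.qr(rng.standard_normal((m, r)))[0]          # original matrix
U_H = np.linalg.qr(rng.standard_normal((j * m, r)))[0]    # Hankel matrix

# Only 1 out of 5 entries of the newest measurement vector is observed.
obs = np.zeros(m, dtype=bool)
obs[0] = True

# In a Hankel column the j - 1 preceding vectors are already available.
obs_H = np.concatenate([np.ones((j - 1) * m, dtype=bool), obs])

rank_olap = np.linalg.matrix_rank(U[obs])        # rows usable by OLAP
rank_olap_h = np.linalg.matrix_rank(U_H[obs_H])  # rows usable by OLAP-H
print(rank_olap, rank_olap_h)
```

With fewer usable rows than unknown coefficients, the fit on the original matrix is underdetermined, whereas the Hankel-based fit keeps full column rank in this toy example.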

Another reason is that the recovered data at one time instant will affect the obtained basis of the underlying subspace and hence affect the recovery on the following measurement vectors. At a higher data loss rate, the recovery error could be gradually accumulated.

7. Conclusions

In this study, the limitations of existing modeless data recovery methods were analyzed, including their degraded performance in missing data recovery under extreme conditions, and the influence of measurement noise on the recovered data was discussed. In order to handle these limitations and reduce the impact of measurement noise, the low-rank property of the Hankel matrix was studied and an improved algorithm, OLAP-H, was developed to recover missing data in real time. To summarize, the following accomplishments have been achieved:

1. The low-rank properties of the original measurement matrix and the constructed Hankel matrix were studied with the KEPCO PMU dataset.

2. A new method, OLAP-H, was developed based on the original OLAP method to exploit the low-rank property of the Hankel matrix to recover the missing data. In order to reduce the influence of noise on the recovered data, PMU channels with higher noise levels were penalized in the data recovery. In addition, the recovered data were smoothed by adding a regularization term on the change in the measurements.

3. In order to simulate the extreme conditions where the performance of existing methods degrades significantly, four modes of data loss were considered. The developed algorithm was applied to fill in unobserved data in all four modes. Numerical experiments show that the recovery performance on the Hankel matrix is much better than the performance on the original measurement matrix.

References

(1) P. Gao, M. Wang, S. G. Ghiocel, J. H. Chow, B. Fardanesh, G. Stefopoulos, 2016, Missing data recovery by exploiting low-dimensionality in power system synchrophasor measurements, IEEE Trans. Power Syst., Vol. 31, No. 2, pp. 1006-1013

(2) N. Dahal, R. L. King, 2012, Online dimension reduction of synchrophasor data, pp. 1-7

(3) Y. Chen, L. Xie, 2013, Dimensionality reduction and early event detection using online synchrophasor data, pp. 1-5

(4) P. Gao, M. Wang, J. H. Chow, S. G. Ghiocel, B. Fardanesh, G. Stefopoulos, Identification of successive unobservable cyber data attacks in power systems through matrix decomposition, IEEE Trans. Signal Process., accepted

(5) Y. Chi, Y. C. Eldar, 2013, Parallel subspace estimation and tracking by recursive least squares from partial observations, IEEE Trans. Signal Process., Vol. 61, No. 23, pp. 5947-5959

(6) L. Balzano, R. Nowak, B. Recht, 2010, Online identification and tracking of subspaces from highly incomplete information, in Proc. Allerton Conf. Commun. Control Comput., pp. 704-711

About the Authors

Jeonghoon Shin

He received the B.S., M.S., and Ph.D. degrees in electrical engineering from Kyungpook National University, Daegu, South Korea, in 1993, 1995, and 2006, respectively.

As of 2020, he is also enrolled in the doctoral program of the Graduate School of Technology Policy at Yonsei University.

Since 1995, he has been with Korea Electric Power Corporation Research Institute (KEPRI), the research institute of Korea Electric Power Corporation.

He is currently a Chief Researcher and leads the Power System Group in power system laboratory, KEPRI.

From March 2003 to February 2004, he was a Visiting Scholar with the Electric Power Research Institute, Palo Alto, CA, USA.

His research interests include wide area monitoring, protection and control systems based on synchro-phasor data, hierarchical voltage controls, real-time digital simulations, and transient/dynamic stability studies.

Tel: 042-865-5810, Fax: 042-865-5829

E-mail : Jeonghoon.shin@kepco.co.kr

Suchul Nam

He received the M.S. degree in electrical engineering from Korea University, Seoul, South Korea.

He has been with Korea Electric Power Corporation Research Institute (KEPRI), the research institute of Korea Electric Power Corporation.

He is currently a senior Researcher in Power System Group, KEPRI.

Tel: 042-865-5823, Fax: 042-865-5829

E-mail : suchul.nam@kepco.co.kr

Evangelous Farantatos

He received the Diploma in Electrical and Computer Engineering from the National Technical University of Athens, Greece, in 2006 and the M.S. and Ph.D. degrees from the Georgia Institute of Technology, Atlanta, GA, USA, in 2009 and 2012, respectively.

He is a Senior Project Manager with the Grid Operations and Planning R&D Group at EPRI, Palo Alto, CA.

He is managing and leading the technical work of various R&D projects related to synchrophasor technology, power systems monitoring and control, power systems stability and dynamics, renewable energy resources modeling, grid operation with high levels of inverter-based resources, and system protection.

He is a Senior Member of IEEE.

In summer 2009, he was an intern at MISO.

E-mail : efarantatos@epri.com

Meng Wang

She received B.S. and M.S. degrees from Tsinghua University, China, in 2005 and 2007, respectively.

She received the Ph.D. degree from Cornell University, Ithaca, NY, USA, in 2012.

She is an Assistant Professor in the department of Electrical, Computer, and Systems Engineering at Rensselaer Polytechnic Institute, Troy, NY, USA.

Her research interests include high-dimensional data analytics, machine learning, power systems monitoring, and synchrophasor technologies.

E-mail : wangm7@rpi.edu

Tae-Eung Sung

He is an associate professor in the Division of Computer and Telecommunications Engineering, Yonsei University, Wonju, Republic of Korea.

He received his B.S. in electrical engineering from Seoul National University in South Korea, and his M.S. and Ph.D., both in electrical and computer engineering, from the University of Texas at Austin and Cornell University in the United States, respectively.

His research interests include, but are not limited to, machine learning and deep learning in data-driven information systems, predictive maintenance using industrial IoT factory data, data mining, technology and patent valuation, communications systems and wireless sensor networks, and stationary and non-stationary signal processing.

Tel: 033-760-2393

E-mail : tesung@yonsei.ac.kr
