CN112752266B - Joint spectrum access and power control method in D2D haptic communication

Info

Publication number: CN112752266B
Application number: CN202011587066.7A
Authority: CN
Inventors: 吴丹; 吴岩; 刘杰; 乐超; 管新荣
Original assignee: Army Engineering University of PLA
Current assignee: Army Engineering University of PLA
Priority date: 2020-12-28
Filing date: 2020-12-28
Publication date: 2022-05-24
Anticipated expiration: 2040-12-28
Also published as: CN112752266A

Abstract

A joint spectrum access and power control method in D2D haptic communication, relating to the technical field of wireless communication. In the method, each D2D pair randomly accesses one resource block according to a contention-based spectrum access mechanism; the packet error rate of the D2D forward link (source to destination) under the contention access mechanism is calculated; the packet error rate of the D2D reverse link (destination to source) under the contention access mechanism is calculated; the packet error rate of the D2D haptic communication closed loop is calculated; an optimization problem is established, namely minimizing the total packet error rate of the system; the optimization problem is modeled as a game; and the Nash equilibrium of the game is solved with a synchronous log-linear learning algorithm. Reasonable power control coordinates the interference among users, so the method effectively eliminates the resource scheduling delay and reduces the error probability of data packet transmission, thereby meeting the delay and reliability requirements of D2D haptic communication.

Description

Joint spectrum access and power control method in D2D haptic communication

Technical Field

The invention relates to the technical field of wireless communication, and in particular to a joint spectrum access and power control method for D2D haptic communication.

Background

With the development of wireless communication technology, people are no longer satisfied with traditional audio-video communication and hope for a more immersive user experience. Haptic communication technology has been developed in response. Haptic communication mainly transmits human haptic signals (such as force, torque, and velocity) and can provide multi-dimensional sensory information to the user. It therefore has broad application prospects, for example in virtual reality, autonomous driving, and smart healthcare.

Haptic communication can be divided into remote haptic communication and local haptic communication according to the distance between the control end and the operation end. Remote haptic communication must transmit signals through the network infrastructure (the network domain). In local haptic communication, the control end and the operation end can communicate directly in device-to-device (D2D) mode, which effectively improves spectrum utilization efficiency and reduces transmission delay.

However, implementing D2D haptic communication is more challenging than traditional D2D audio-video communication. An audio/video stream has a high throughput requirement, so it mainly needs ample bandwidth, and it is generally unidirectional. By contrast, a haptic stream is bidirectional: the control end sends control signals to the operation end on the forward link, and the operation end feeds haptic signals back to the control end on the reverse link, forming a closed control loop. Furthermore, to ensure an immersive user experience, the entire control loop has very strict latency and reliability requirements, so haptic communication is an ultra-reliable low-latency communication (URLLC) scenario.

Radio resource allocation, including spectrum access and power control, is a key step in implementing D2D haptic communication because it directly affects latency and reliability. A small body of literature currently investigates resource management for haptic communication. The document [Toward Human-in-the-Loop Mobile Networks: A Radio Resource Allocation Perspective on Haptic Communications, IEEE Trans. Wireless Commun., vol. 17, no. 7, pp. 4493-4508, 2018] proposes a low-complexity greedy algorithm and a near-optimal greedy resource allocation scheme for two resource allocation cases, perceptual coding and symmetric design, respectively. The document [Fog Computing for 5G Tactile Industrial Internet of Things: QoE-Aware Resource Allocation Model, IEEE Trans. Ind. Informat., vol. 15, no. 5, pp. 3085-3092, 2019] proposes a dynamic resource allocation model based on user experience for haptic communication applications in the Internet of Things, and implements the model in Java.

However, the above documents are all based on a request-grant mechanism: the user must send a resource scheduling request to the base station and wait for the grant before using spectrum resources. This handshake increases the end-to-end latency to the point that the latency requirements of haptic communication cannot be met.

Disclosure of Invention

The invention addresses the problem that existing resource allocation schemes increase the delay and reduce the reliability of D2D haptic communication, and provides a joint spectrum access and power control method. The method effectively eliminates the resource scheduling delay and reduces the error probability of data packet transmission, thereby meeting the delay and reliability requirements of D2D haptic communication.

The invention provides a joint spectrum access and power control method in D2D haptic communication, comprising the following steps:

Step one: In the local haptic communication scenario, the control end and the operation end communicate directly via D2D. The control end and the operation end therefore form a D2D pair, in which the control end is the source and the operation end is the destination. Suppose a cell contains N D2D pairs and K orthogonal resource blocks (N > K), indexed by the sets {1, ..., N} and {1, ..., K}, respectively. The source and destination of D2D pair n are denoted e_n and r_n, and all sources and all destinations form the sets {e_1, ..., e_n, ..., e_N} and {r_1, ..., r_n, ..., r_N}, respectively. Each D2D pair randomly accesses one resource block according to the contention-based spectrum access mechanism.
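The contention-based access in step one can be illustrated with a minimal sketch. It assumes that every pair picks its resource block independently and uniformly at random; the function names `random_access` and `collided_pairs` are hypothetical and do not appear in the patent.

```python
import random
from collections import Counter

def random_access(num_pairs, num_blocks, rng):
    """Each D2D pair independently picks one resource block uniformly at random."""
    return [rng.randrange(num_blocks) for _ in range(num_pairs)]

def collided_pairs(assignment):
    """Return the indices of pairs that share their resource block with another pair."""
    load = Counter(assignment)
    return [n for n, k in enumerate(assignment) if load[k] > 1]

rng = random.Random(0)  # fixed seed, used only to make the sketch reproducible
assignment = random_access(num_pairs=10, num_blocks=3, rng=rng)
print(assignment, collided_pairs(assignment))
```

Because N > K, at least two pairs always land on the same block, which is why the later steps have to deal with mutual interference.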

Step two: Calculate the packet error rate of the D2D forward link under the contention access mechanism, i.e., the packet error rate at destination r_n. The forward link runs from the source to the destination. According to the short-packet formula for finite block length, this packet error rate depends on the received signal-to-noise ratio at destination r_n, so the received signal-to-noise ratio at r_n under the contention access mechanism is calculated.

Step three: Calculate the packet error rate of the D2D reverse link under the contention access mechanism, i.e., the packet error rate at source e_n. The reverse link runs from the destination to the source. According to the short-packet formula for finite block length, this packet error rate depends on the received signal-to-noise ratio at source e_n, so the received signal-to-noise ratio at e_n under the contention access mechanism is calculated.

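Step four: Calculate the packet error rate of the D2D haptic communication closed loop. According to steps two and three, the closed loop is transmitted successfully only if both the forward link and the reverse link succeed, so its success probability is the product of the two links' success probabilities. The closed-loop packet error rate can therefore be approximated by the sum of the forward-link and reverse-link packet error rates.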
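The approximation in step four can be written out as a short derivation. The symbols below are assumptions introduced for illustration, since the original formula images are not reproduced here: \varepsilon^{f}_{n} is the forward-link packet error rate at r_n, \varepsilon^{r}_{n} is the reverse-link packet error rate at e_n, and \varepsilon_{n} is the closed-loop packet error rate of pair n.

```latex
% Assumed symbols: \varepsilon^{f}_{n} (forward link), \varepsilon^{r}_{n} (reverse link)
\begin{aligned}
\Pr\{\text{closed loop succeeds}\} &= (1-\varepsilon^{f}_{n})(1-\varepsilon^{r}_{n}),\\
\varepsilon_{n} &= 1-(1-\varepsilon^{f}_{n})(1-\varepsilon^{r}_{n})
 = \varepsilon^{f}_{n}+\varepsilon^{r}_{n}-\varepsilon^{f}_{n}\varepsilon^{r}_{n}
 \approx \varepsilon^{f}_{n}+\varepsilon^{r}_{n}.
\end{aligned}
```

When both link error rates are at most 10^-3, the dropped product term is at most 10^-6, which is negligible next to the sum.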

Step five: Establish the optimization problem. The total packet error rate of the N D2D pairs in the network is the sum of their closed-loop packet error rates. On this basis, under the contention access mechanism, the optimization goal of power control is to minimize the total packet error rate of the system, subject to (s.t.) a closed-loop packet error rate constraint on each D2D pair and a transmit power constraint. Here, P = (p_1, ..., p_n, ..., p_N) is the transmit power vector of the N D2D pairs, each p_n consists of the transmit powers of the source and the destination of pair n, S is the maximum transmit power level available to the source and the destination, and ε_th is the closed-loop packet error rate threshold of each D2D pair.
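One plausible way to write the problem of step five is sketched below. This is a reconstruction from the surrounding definitions, not the patent's own formula: the symbols \varepsilon_{n}, p_{e_n}, and p_{r_n}, and the assumption that power levels are indexed 1 to S, are introduced here for illustration.

```latex
% Assumed symbols: \varepsilon_n (closed-loop PER of pair n),
% p_{e_n}, p_{r_n} (power levels of source and destination of pair n)
\begin{aligned}
\min_{P}\quad & \sum_{n=1}^{N} \varepsilon_{n}(P)\\
\text{s.t.}\quad & \varepsilon_{n}(P) \le \varepsilon_{th}, \quad n = 1,\dots,N,\\
& p_{e_n},\, p_{r_n} \in \{1,\dots,S\}, \quad n = 1,\dots,N.
\end{aligned}
```

The discrete power levels combined with the nonlinear dependence of each packet error rate on P are what make this a mixed-integer nonlinear program.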

Step six: Model the optimization problem as a game. The optimization problem is a mixed-integer nonlinear programming problem and is solved using game theory. The established game model comprises the set of game participants, the strategy space, the strategy set of each game participant n, and the utility function u_n(p_n, p_-n) of game participant n, where p_n is the transmit power of game participant n and p_-n is the transmit power of all game participants other than n. On this basis, the proposed power control game is formulated.

Step seven: Solve for the Nash equilibrium of the game. With the synchronous log-linear learning (SLL) algorithm, the game converges to the optimal Nash equilibrium without any information exchange or coordination mechanism, i.e., the sum of the packet error rates of all D2D pairs under the contention access mechanism is minimized. The SLL algorithm is based on the Boltzmann exploration strategy: a participant selects a strategy with higher utility with greater probability than a strategy with lower utility. The Boltzmann exploration strategy is therefore regarded as an effective way to escape local optima and eventually reach the global optimum.

The invention provides a contention-based resource access mechanism. Specifically, a number of orthogonal resource blocks are first reserved for D2D haptic communication. When a data packet needs to be transmitted, the source or the destination randomly accesses a resource block on the forward or reverse link without sending a resource scheduling request and waiting for a grant from the base station. Although this access mechanism effectively eliminates the scheduling delay, random access means that two or more users may access the same resource block, in which case they interfere with one another and transmission reliability suffers. Reasonable power control then coordinates the interference among users, so that the delay and reliability requirements of D2D haptic communication are met simultaneously.

The invention provides a joint spectrum access and power control method for D2D haptic communication that effectively eliminates the resource scheduling delay and reduces the error probability of data packet transmission, thereby meeting the delay and reliability requirements of D2D haptic communication. The method first adopts contention-based spectrum access to eliminate the resource scheduling delay, and then reduces the transmission packet error rate through reasonable power control under this mechanism. Specifically, the power control problem is converted into a non-cooperative game model, and a distributed learning algorithm is used to solve for the Nash equilibrium.

Drawings

FIG. 1 is a diagram of a D2D haptic communication model of the present invention.

Fig. 2 is a diagram illustrating interference analysis between D2D pairs under the contention mechanism of the present invention.

Fig. 3 is a flow chart of the joint spectrum access and power control of the present invention.

Fig. 4a is a schematic diagram comparing the total packet error rate of the algorithm of the present invention with the optimal solution over the number of iterations.

Fig. 4b is a schematic diagram comparing the power selection of different D2D pairs over the number of iterations.

Fig. 5a is a schematic diagram of the relationship between packet error rate and packet length under different algorithms.

Fig. 5b is a schematic diagram of the relationship between packet error rate and number of resource blocks under different algorithms.

Detailed Description

The invention is further described below with reference to the accompanying drawings.

The D2D haptic communication model is further described with reference to Fig. 1. A D2D pair consists of a source and a destination. In haptic communication, the source is typically a haptic device that converts human actions and sensations into haptic signal input through various encoding techniques. The destination is typically a remotely controlled device that interacts with the environment and feeds haptic signals back to the source.

The communication process of D2D haptic communication is as follows. First, the source transmits a control signal to the destination over the forward link. The destination then interacts with the environment based on the received control signal and sends a haptic feedback signal to the source over the reverse link, forming a closed-loop D2D haptic communication model. In the present invention, the sources and destinations of the multiple D2D pairs are strictly synchronized in time when transmitting.

Referring to Fig. 2, the interference between D2D pairs under the contention mechanism is analyzed further. Taking the forward link as an example, consider the interference experienced by destination r_1. Because transmit power is finite and the wireless channel fades, each source can affect only the destinations within a certain coverage range. For example, the pink shaded area can be regarded as the coverage of source e_3: destinations inside this area can be affected by e_3, while destinations outside it cannot. Assume that the 3 D2D pairs in the figure access the same resource block simultaneously. Then destination r_1 is interfered with by source e_2 but not by source e_3, because r_1 lies within the coverage of e_2 but not within the coverage of e_3. Similarly, on the reverse link, source e_1 is interfered with by destination r_3, because e_1 lies within the coverage of r_3.

The implementation steps of the present invention will be further explained with reference to fig. 3:

Step one: Suppose a cell contains N D2D pairs and K orthogonal resource blocks, indexed by the sets {1, ..., N} and {1, ..., K}, respectively. The source and destination of D2D pair n are denoted e_n and r_n, and all sources and all destinations form the sets {e_1, ..., e_n, ..., e_N} and {r_1, ..., r_n, ..., r_N}, respectively.

Step two: Calculate the packet error rate of the D2D forward link (source to destination) under the contention access mechanism, i.e., the packet error rate at destination r_n. Under the contention access mechanism, each user accesses one resource block at random with equal probability. If D2D pair n accesses resource block k, the received signal-to-noise ratio (SNR) at destination r_n is determined by the following quantities.

The transmit power of source e_n; H(e_n, r_n), the channel gain from source e_n to destination r_n; σ², the noise power; 1/K, the probability that D2D pair i accesses resource block k; P_e, the vector of source transmit powers; and the set of all sources whose SNR at destination r_n exceeds γ_th1 under the vector P_e. Here γ_th1 is the SNR threshold for causing interference at destination r_n: if the SNR of another source at r_n exceeds γ_th1, that source interferes with reception at r_n; otherwise it does not affect reception at r_n. Further, e_m is the source of D2D pair m, with its transmit power and the channel gain H(e_m, r_n) from source e_m to destination r_n; and e_i is the source of D2D pair i, with its transmit power and the channel gain H(e_i, r_n) from source e_i to destination r_n.
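The threshold-based interferer set and the resulting forward-link SNR can be sketched as follows. The patent's exact SNR expression is not reproduced here, so the form below is an assumption: every interfering source is weighted by its 1/K probability of sharing the resource block, and the function names and the `gain[m][n]` layout for H(e_m, r_n) are hypothetical.

```python
def interferer_set(n, src_power, gain, noise_power, snr_threshold):
    """Sources m != n whose SNR at destination r_n exceeds the threshold gamma_th1.

    gain[m][n] holds the channel gain H(e_m, r_n) from source m to destination n.
    """
    return [
        m for m in range(len(src_power))
        if m != n and src_power[m] * gain[m][n] / noise_power > snr_threshold
    ]

def forward_snr(n, src_power, gain, noise_power, snr_threshold, num_blocks):
    """Received SNR at r_n; interferers are weighted by 1/K (assumed form)."""
    interferers = interferer_set(n, src_power, gain, noise_power, snr_threshold)
    interference = sum(src_power[m] * gain[m][n] for m in interferers) / num_blocks
    return src_power[n] * gain[n][n] / (noise_power + interference)

# Illustrative gains loosely mirroring Fig. 2: e_2 reaches r_1, e_3 does not.
gain = [
    [1e-6, 1e-11, 1e-11],
    [1e-9, 1e-6, 1e-11],
    [1e-11, 1e-9, 1e-6],
]
power = [0.1, 0.1, 0.1]
noise = 1e-13        # -100 dBm expressed in watts
threshold = 100.0    # 20 dB as a linear ratio
print(interferer_set(0, power, gain, noise, threshold))
print(forward_snr(0, power, gain, noise, threshold, num_blocks=3))
```

The reverse link can be treated the same way by swapping the roles of sources and destinations and using the threshold γ_th2.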

Since D2D pair n accesses a resource block at random under the contention mechanism, its received signal-to-noise ratio is taken over the K resource blocks. According to the short-packet formula for finite block length, the packet error rate of the D2D forward link (source to destination) is expressed through the Gaussian Q function, where t is the integration variable and dt is its differential, L is the packet length, and η is the number of information bits transmitted.
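The finite-blocklength packet error rate can be evaluated numerically with a short sketch. It assumes the widely used normal approximation, with capacity log2(1 + γ) and channel dispersion (1 - (1 + γ)^-2)(log2 e)^2; the patent's own expression is not reproduced here, so treat this form and the function names as assumptions. The SNR must be strictly positive.

```python
import math

def q_function(x):
    """Gaussian Q function: tail probability of a standard normal variable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def packet_error_rate(snr, blocklength, info_bits):
    """Normal approximation of the packet error rate at finite blocklength.

    snr is a linear ratio (> 0), blocklength is L in channel uses,
    info_bits is eta, the number of information bits in the packet.
    """
    capacity = math.log2(1.0 + snr)  # bits per channel use
    dispersion = (1.0 - (1.0 + snr) ** -2) * math.log2(math.e) ** 2
    rate = info_bits / blocklength
    return q_function(math.sqrt(blocklength / dispersion) * (capacity - rate))

# Simulation values from the text: 400 information bits in 50 channel uses.
for snr_db in (24, 26, 28):
    snr = 10 ** (snr_db / 10)
    print(snr_db, packet_error_rate(snr, blocklength=50, info_bits=400))
```

With these values the coding rate is 8 bits per channel use, so the error rate falls steeply once log2(1 + γ) exceeds 8.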

Step three: Calculate the packet error rate of the D2D reverse link (destination to source) under the contention access mechanism, i.e., the packet error rate at source e_n. Under the contention mechanism, the received signal-to-noise ratio of D2D pair n over the K resource blocks is determined by the following quantities.

The transmit power of destination r_n; H(e_n, r_n), the channel gain from source e_n to destination r_n; σ², the noise power; 1/K, the probability that D2D pair i accesses resource block k; P_r, the vector of destination transmit powers; and the set of all destinations whose SNR at source e_n exceeds γ_th2 under the vector P_r, where γ_th2 is the SNR threshold for causing interference at source e_n. Further, r_m is the destination of D2D pair m, with its transmit power and the channel gain H(e_n, r_m) from source e_n to destination r_m; and r_i is the destination of D2D pair i, with its transmit power and the channel gain H(e_n, r_i) from source e_n to destination r_i.

According to the short-packet formula for finite block length, the packet error rate of the D2D reverse link (destination to source) is expressed in the same way as for the forward link, where L is the packet length and η is the number of information bits transmitted.

Because the forward-link and reverse-link packet error rates are typically on the order of 10^-3 to 10^-5, the closed-loop packet error rate can be approximated as their sum.

In the constraints (s.t.) of the resulting optimization problem, the transmit power of each D2D pair consists of the transmit powers of the source and the destination, S is the maximum transmit power level available to the source and the destination, and ε_th is the closed-loop packet error rate threshold of each D2D pair.

Step six: Model the optimization problem as a game. The optimization problem is a mixed-integer nonlinear programming problem and is solved using game theory. The game model comprises the set of game participants, the strategy space, and the strategy set of each game participant n. Because the source and the destination each select a transmit power level of at most S, the strategy set of game participant n can be expressed in terms of these power levels.

u_n(p_n, p_-n) denotes the utility function of game participant n, where p_n is the transmit power of game participant n and p_-n is the transmit power of all game participants other than n. In this model, u_n(p_n, p_-n) is defined in terms of the following quantities.

Wherein r is_iFor the destination of D2D pair i,

to a destination end r_iThe packet error rate of (2); e.g. of the type_jBeing the source end of D2D pair j,

As a source end e_jThe packet error rate of (d);

as source end e_nSet of destinations affected when transmitting at transmit power S, i.e.

Wherein

In the same way, the method for preparing the composite material,

is as the destination terminal r_nSet of source terminals affected when transmitting at a transmit power SI.e. by

Wherein

Therefore, u_nThe physical meaning is the sum of the packet error rates of all the sources and destinations affected by the game participant n.

Thus, the proposed power control game can be modeled as

Step seven: Solve for the Nash equilibrium of the game. With the synchronous log-linear learning (SLL) algorithm, the game converges to the optimal Nash equilibrium without any information exchange or coordination mechanism, i.e., the sum of the packet error rates of all D2D pairs under the contention access mechanism is minimized. The algorithm is based on the Boltzmann exploration strategy: a participant selects a strategy with higher utility with greater probability than a strategy with lower utility. The Boltzmann exploration strategy is therefore regarded as an effective way to escape local optima and eventually reach the global optimum. The proposed algorithm is as follows:

Algorithm 1: SLL-based power control algorithm

Initialization: Set the initial iteration time t = 0 and initialize the power vector of all game participants, p(0) = {p_1(0), ..., p_N(0)}, where p_n(0) is the initial strategy of game participant n. In addition, set a binary indicator variable b_n for each game participant n.

All game participants simultaneously perform the following steps:

Repeat

t = t + 1

If b_n(t-1) = 0, game participant n updates its strategy according to the exploration rule, in which p' is the strategy adopted by game participant n at time t, the number of elements in the strategy set of game participant n enters the selection probability, and δ_n is the exploration rate of game participant n. Furthermore, if p_n(t) ≠ p_n(t-1), set b_n(t) = 1; otherwise, set b_n(t) = 0.

End if

If b_n(t-1) = 1, game participant n updates its strategy according to the learning rule, in which β is a learning parameter and the utilities of game participant n at times t-1 and t-2 are compared. Then set b_n(t) = 0.

End if

If the constraint condition cannot be satisfied

Set p(t) = p(t-1)

End if

Until p_n remains unchanged.
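The structure of Algorithm 1 can be sketched in code. The patent's update formulas are not reproduced here, so this sketch assumes the usual payoff-based log-linear form: a participant with b_n = 0 experiments with probability δ_n, and a participant with b_n = 1 chooses between its last two strategies with Boltzmann probabilities governed by β. The function `sll_power_control`, the toy utility, and all parameter values are hypothetical, and the constraint check of Algorithm 1 is omitted for brevity.

```python
import math
import random

def sll_power_control(strategies, utility, num_players, beta, explore, iters, rng):
    """Synchronous log-linear learning over a finite strategy set (sketch).

    utility(n, profile) returns the utility of player n under the joint profile.
    """
    profile = [rng.choice(strategies) for _ in range(num_players)]
    prev_profile = list(profile)
    prev_util = [utility(n, profile) for n in range(num_players)]
    flag = [0] * num_players  # b_n: 1 if player n experimented in the last step
    for _ in range(iters):
        util = [utility(n, profile) for n in range(num_players)]
        nxt = list(profile)
        for n in range(num_players):  # all players update simultaneously
            if flag[n] == 0:
                # Exploration: with probability `explore`, try a different strategy.
                if rng.random() < explore:
                    nxt[n] = rng.choice([s for s in strategies if s != profile[n]])
                    flag[n] = 1
            else:
                # Boltzmann choice between the strategies played at t-1 and t-2.
                diff = beta * (prev_util[n] - util[n])
                p_keep = 1.0 / (1.0 + math.exp(min(700.0, diff)))
                if rng.random() >= p_keep:
                    nxt[n] = prev_profile[n]
                flag[n] = 0
        prev_profile, prev_util, profile = profile, util, nxt
    return profile

def toy_utility(n, profile):
    # Hypothetical common-interest utility: all players prefer powers near 0.2.
    return -sum((p - 0.2) ** 2 for p in profile)

levels = [0.1, 0.2, 0.3]
final = sll_power_control(levels, toy_utility, num_players=4, beta=2000.0,
                          explore=0.05, iters=500, rng=random.Random(1))
print(final)
```

A larger β makes each participant keep the better of its last two strategies more reliably, while a smaller exploration rate reduces how often a settled profile is disturbed.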

The effect of the invention will be further described with reference to fig. 4 and 5:

The simulation in the invention is based on MATLAB R2016a. The simulation parameters are set as follows: the number of D2D pairs is 10, the number of resource blocks is 3, the quantized power set is {0.1, 0.2, 0.3}, the noise power is -100 dBm, the distance from the source to the destination of each D2D pair is 15 m, the signal-to-noise ratio threshold γ_th is 20 dB, the number of information bits is 400 bits, and the packet length is 50 channel uses. The strategy set of each game participant follows from the quantized power set.
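The simulation setup can be turned into concrete values with a few lines. This sketch assumes that a strategy is a (source power, destination power) pair drawn from the quantized power set, which matches the statement that each p_n consists of the source and destination transmit powers; the variable names are hypothetical.

```python
from itertools import product

power_levels = (0.1, 0.2, 0.3)  # quantized power set from the simulation setup

# Assumed strategy of one D2D pair: (source power, destination power).
strategy_set = list(product(power_levels, repeat=2))

noise_watts = 10 ** (-100 / 10) / 1000  # -100 dBm converted to watts
snr_threshold = 10 ** (20 / 10)         # 20 dB converted to a linear ratio

print(len(strategy_set), noise_watts, snr_threshold)
```

Under this assumption each game participant chooses among 9 strategies, so the joint strategy space of 10 D2D pairs has 9^10 profiles, which is why exhaustive search is impractical and a learning algorithm is used.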

Fig. 4 analyzes the convergence of the proposed game learning algorithm. In Fig. 4a, the proposed algorithm gradually converges to the optimal power selection strategy as the iterations proceed. Note that even after the algorithm reaches the optimal power selection, it may still leave the current solution to explore unknown regions, as shown by the red circles. When the number of iterations is large enough, the proposed algorithm converges. Fig. 4b further illustrates the convergence: 4 D2D pairs are picked at random from the 10 D2D pairs and denoted D2D pair 1, D2D pair 2, D2D pair 3, and D2D pair 4. When the number of iterations is large enough, the power selection of each D2D pair converges to a particular strategy.

Fig. 5 shows how the packet error rate varies with the packet length and with the number of resource blocks. "BR" denotes the best response algorithm, in which the user selects, in each iteration, the strategy that maximizes its own utility. As shown in Fig. 5a, the packet error rate of all algorithms decreases as the packet length increases, until it approaches 0. In Fig. 5b, the packet error rate of all algorithms decreases as the number of resource blocks increases, because the probability that users access the same resource block falls. In addition, when K ≥ 10, each user can be assigned a dedicated resource block, so no mutual interference occurs.
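The effect of the number of resource blocks on collisions can be checked with a small calculation. It assumes purely random access in which every pair picks a block independently and uniformly, and the function name is hypothetical. Under that assumption, a given pair shares its block with at least one of the other N - 1 pairs with probability 1 - (1 - 1/K)^(N-1).

```python
def collision_probability(num_pairs, num_blocks):
    """Probability that a given pair shares its block with at least one other pair."""
    return 1.0 - (1.0 - 1.0 / num_blocks) ** (num_pairs - 1)

for k in (3, 5, 10, 20):
    print(k, collision_probability(10, k))
```

The probability decreases monotonically in K, consistent with the trend described for Fig. 5b. Under purely random access it stays above zero for any finite K, so the interference-free case with K ≥ 10 corresponds to dedicated assignment rather than random selection.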

Claims

1. A joint spectrum access and power control method in D2D haptic communication, characterized by comprising the following steps:

Step one: for a local haptic communication scenario, the control end and the operation end communicate directly in D2D mode and form a D2D pair, wherein the control end is the source and the operation end is the destination; suppose a cell contains N D2D pairs and K orthogonal resource blocks, where N > K, indexed by the sets {1, ..., N} and {1, ..., K}, respectively; the source and destination of D2D pair n are denoted e_n and r_n, and all sources and all destinations form the sets {e_1, ..., e_n, ..., e_N} and {r_1, ..., r_n, ..., r_N}, respectively; each D2D pair randomly accesses one resource block according to a contention-based spectrum access mechanism;

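Step two: calculating the packet error rate of the D2D forward link under the contention access mechanism, i.e., the packet error rate at destination r_n; the forward link runs from the source to the destination; according to the short-packet formula for finite block length, this packet error rate depends on the received signal-to-noise ratio at destination r_n; calculating the received signal-to-noise ratio at destination r_n under the contention access mechanism;

Step three: calculating the packet error rate of the D2D reverse link under the contention access mechanism, i.e., the packet error rate at source e_n; the reverse link runs from the destination to the source; according to the short-packet formula for finite block length, this packet error rate depends on the received signal-to-noise ratio at source e_n; calculating the received signal-to-noise ratio at source e_n under the contention access mechanism;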

Step four: calculating the packet error rate of the D2D haptic communication closed loop; according to steps two and three, the success probability of the closed loop is the product of the success probabilities of the forward link and the reverse link, so the closed-loop packet error rate is approximately the sum of the forward-link and reverse-link packet error rates;

Step five: establishing the optimization problem; the total packet error rate of the N D2D pairs in the network is the sum of their closed-loop packet error rates, and it is minimized subject to (s.t.) the constraints, in which the transmit power of each D2D pair consists of the transmit powers of the source and the destination, S is the maximum transmit power level available to the source and the destination, and ε_th is the closed-loop packet error rate threshold of each D2D pair;

Step six: modeling the optimization problem as a game, the established game model comprising the set of game participants, the strategy space, the strategy set of each game participant n, and the utility function u_n(p_n, p_-n) of game participant n, where p_n is the transmit power of game participant n and p_-n is the transmit power of all game participants other than n; the proposed power control game is formulated on this basis;

Step seven: solving for the Nash equilibrium of the game using a synchronous log-linear learning algorithm, which converges to the optimal Nash equilibrium without any information exchange or coordination mechanism, i.e., the sum of the packet error rates of all D2D pairs under the contention access mechanism is minimized.

2. The joint spectrum access and power control method in D2D haptic communication of claim 1, wherein the transmissions of the sources and destinations of the D2D pairs are synchronized in time.

3. The joint spectrum access and power control method in D2D haptic communication of claim 1, wherein the relationship between the destination packet error rate in step two and the received signal-to-noise ratio at destination r_n is given by the short-packet formula for finite block length, in which L is the packet length and η is the number of information bits transmitted.

4. The joint spectrum access and power control method in D2D haptic communication of claim 3, wherein the received signal-to-noise ratio at destination r_n is calculated as follows: assuming that D2D pair n accesses resource block k, the received signal-to-noise ratio (SNR) at destination r_n is determined by the following quantities: P_e, the vector of source transmit powers; the set of all sources whose SNR at destination r_n exceeds γ_th1 under the vector P_e, where γ_th1 is the SNR threshold for causing interference at destination r_n, such that a source whose SNR at r_n exceeds γ_th1 interferes with reception at r_n and otherwise does not affect it; e_m, the source of D2D pair m; and the transmit power of source e_i together with H(e_i, r_n), the channel gain from source e_i to destination r_n.

5. The joint spectrum access and power control method in D2D haptic communication of claim 1, wherein the relationship between the source packet error rate in step three and the received signal-to-noise ratio at source e_n is given by the short-packet formula for finite block length, in which L is the packet length and η is the number of information bits transmitted.

6. The joint spectrum access and power control method in D2D haptic communication of claim 5, wherein the received signal-to-noise ratio at source e_n is calculated as follows: under the contention mechanism, the received signal-to-noise ratio (SNR) of D2D pair n over the K resource blocks is determined by the following quantities: the transmit power of destination r_n; H(e_n, r_n), the channel gain from source e_n to destination r_n; σ², the noise power; 1/K, the probability that D2D pair i accesses resource block k; P_r, the vector of destination transmit powers; the set of all destinations whose SNR at source e_n exceeds γ_th2 under the vector P_r, where γ_th2 is the SNR threshold for causing interference at source e_n; and r_m, the destination of D2D pair m.

7. The joint spectrum access and power control method in D2D haptic communication of claim 1, wherein the strategy set of game participant n in step six is expressed in terms of the transmit power levels of the source and the destination, where S is the maximum transmit power level available to the source and the destination.

8. The joint spectrum access and power control method in D2D haptic communication of claim 1, wherein the utility function u_n(p_n, p_-n) of game participant n in step six is defined in terms of the following quantities: r_i, the destination of D2D pair i; the packet error rate of source e_j; the set of destinations affected when source e_n transmits at transmit power S; and, similarly, the set of sources affected when destination r_n transmits at transmit power S.

9. The joint spectrum access and power control method in D2D haptic communication of claim 1, wherein the power control algorithm based on synchronous log-linear learning in step seven is as follows:

Initialization: setting the initial iteration time t = 0 and initializing the power vector of all game participants p(0) = {p_1(0), ..., p_N(0)}, wherein