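Convert Unicode surrogate pair to literal string

Time: 2018-10-01 03:42:41

Tags: c# .net unicode unicode-escapes

I am trying to read a high Unicode character from one string into another. For brevity, I have simplified my code as shown below: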

public static void UnicodeTest()
{
    var highUnicodeChar = "𝐀"; // Not the standard A

    var result1 = highUnicodeChar; // this works
    var result2 = highUnicodeChar[0].ToString(); // returns \ud835
}

When I assign highUnicodeChar directly to result1, it retains its literal value of 𝐀. When I try to access it by index, it returns \ud835. As I understand it, this is one half of a surrogate pair of UTF-16 characters used to represent a UTF-32 character. I am pretty sure the problem has to do with trying to implicitly convert a char to a string.

In the end, I want result2 to yield the same value as result1. How can I do this?

2 answers:

Answer 0: (score: 22)

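In Unicode, you have code points. These are 21 bits long. Your character 𝐀, Mathematical Bold Capital A, has a code point of U+1D400.

In Unicode encodings, you have code units. These are the natural unit of the encoding: 8 bits for UTF-8, 16 bits for UTF-16, and so on. One or more code units encode a single code point.

In UTF-16, two code units that form a single code point are called a surrogate pair. Surrogate pairs are used to encode any code point that does not fit in 16 bits, i.e. U+10000 and up.

This gets a little tricky in .NET, because a .NET Char represents a single UTF-16 code unit, and a .NET String is a collection of code units.

So your code point (U+1D400) cannot fit in 16 bits and needs a surrogate pair, which means your string has two code units in it: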

var highUnicodeChar = "𝐀";
char a = highUnicodeChar[0]; // code unit 0xD835
char b = highUnicodeChar[1]; // code unit 0xDC00

This means that when you index into the string like that, you are actually only getting half of the surrogate pair.
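To make the two code units less mysterious, here is a minimal sketch of the UTF-16 surrogate arithmetic for U+1D400. The class SurrogateDemo and the helper ToSurrogates are hypothetical names introduced for illustration; char.ConvertFromUtf32 and char.ConvertToUtf32 are the framework methods that do the same conversion for you.

```csharp
using System;

public static class SurrogateDemo
{
    // Hypothetical helper: splits a supplementary code point
    // (U+10000..U+10FFFF) into its UTF-16 high and low surrogates.
    public static (char High, char Low) ToSurrogates(int codePoint)
    {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF)
            throw new ArgumentOutOfRangeException(nameof(codePoint));

        int offset = codePoint - 0x10000;              // 20 significant bits remain
        char high = (char)(0xD800 + (offset >> 10));   // top 10 bits
        char low = (char)(0xDC00 + (offset & 0x3FF));  // bottom 10 bits
        return (high, low);
    }

    public static void Main()
    {
        var (high, low) = ToSurrogates(0x1D400);
        Console.WriteLine($"{(int)high:X4} {(int)low:X4}"); // D835 DC00

        // The framework performs the same conversion in both directions.
        string s = char.ConvertFromUtf32(0x1D400);
        Console.WriteLine(s.Length);                                // 2
        Console.WriteLine(char.ConvertToUtf32(s, 0).ToString("X")); // 1D400
    }
}
```

In practice you would likely call the framework methods directly; the manual arithmetic is shown only to explain where 0xD835 and 0xDC00 come from.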

You can use IsSurrogatePair to test for a surrogate pair. For instance:

string GetFullCodePointAtIndex(string s, int idx) =>
    s.Substring(idx, char.IsSurrogatePair(s, idx) ? 2 : 1);
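As a usage sketch, the helper above can be used to walk a string one code point at a time. The class name CodePointDemo and the CodePoints iterator are assumptions added for illustration and are not part of the answer's original code.

```csharp
using System;
using System.Collections.Generic;

public static class CodePointDemo
{
    // Same helper as in the answer: returns one or two code units.
    public static string GetFullCodePointAtIndex(string s, int idx) =>
        s.Substring(idx, char.IsSurrogatePair(s, idx) ? 2 : 1);

    // Hypothetical iterator: yields each code point as its own string,
    // advancing by two code units when a surrogate pair is found.
    public static IEnumerable<string> CodePoints(string s)
    {
        for (int i = 0; i < s.Length; )
        {
            string cp = GetFullCodePointAtIndex(s, i);
            yield return cp;
            i += cp.Length;
        }
    }

    public static void Main()
    {
        string text = "x\U0001D400y"; // 'x', U+1D400, 'y' (four code units)
        foreach (string cp in CodePoints(text))
            Console.WriteLine($"U+{char.ConvertToUtf32(cp, 0):X4}");
        // Prints U+0078, U+1D400, U+0079 on separate lines.
    }
}
```

Note that an unpaired surrogate is returned as a single code unit by this helper, and char.ConvertToUtf32 would throw on it, so real input may need extra validation.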

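It is important to note that the rabbit hole of variable-length encoding in Unicode does not end at the code point. A grapheme cluster is the "visible thing" that most people, when asked, would ultimately call a "character". A grapheme cluster is made from one or more code points: a base character and zero or more combining characters. An example of a combining character is an umlaut or any of the various other decorations or modifiers you might want to add. For a horrifying example of what combining characters can do, see this answer.

To test for a combining character, you can use GetUnicodeCategory to check for an enclosing mark, a non-spacing mark, or a spacing mark.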
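A minimal sketch of that check, assuming the three mark categories map to UnicodeCategory.EnclosingMark, NonSpacingMark and SpacingCombiningMark. The class CombiningDemo and the helper IsCombiningMark are hypothetical names used only for this example.

```csharp
using System;
using System.Globalization;

public static class CombiningDemo
{
    // Hypothetical helper: true when the code point at idx is a combining mark.
    public static bool IsCombiningMark(string s, int idx)
    {
        UnicodeCategory category = CharUnicodeInfo.GetUnicodeCategory(s, idx);
        return category == UnicodeCategory.NonSpacingMark
            || category == UnicodeCategory.SpacingCombiningMark
            || category == UnicodeCategory.EnclosingMark;
    }

    public static void Main()
    {
        string s = "H\u0302"; // 'H' followed by U+0302 COMBINING CIRCUMFLEX ACCENT
        Console.WriteLine(IsCombiningMark(s, 0)); // False
        Console.WriteLine(IsCombiningMark(s, 1)); // True
    }
}
```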

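Answer 1: (score: 8)

It seems that you want to extract the first "atomic" character from the user's point of view (i.e. the first Unicode grapheme cluster) from the highUnicodeChar string, where an "atomic" character includes both halves of a surrogate pair.

You can use StringInfo.GetTextElementEnumerator() to do this, breaking a string down into atomic chunks and then taking the first.

First, define the following extension method: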

using System.Collections.Generic;
using System.Globalization;
using System.Linq; // needed by callers that use FirstOrDefault()

public static class TextExtensions
{
    public static IEnumerable<string> TextElements(this string s)
    {
        // StringInfo.GetTextElementEnumerator dates from .NET 1.1 and returns a
        // non-generic TextElementEnumerator, so wrap it in an IEnumerable<string>.
        if (s == null)
            yield break;
        var enumerator = StringInfo.GetTextElementEnumerator(s);
        while (enumerator.MoveNext())
            yield return enumerator.GetTextElement();
    }
}

Now, you can do:

var result2 = highUnicodeChar.TextElements().FirstOrDefault() ?? "";

Note that StringInfo.GetTextElementEnumerator() will also group Unicode combining characters, so the first grapheme cluster of the string Ĥ=T̂+V̂ will be Ĥ, not H.
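The grouping behaviour can be checked with a short sketch. It uses StringInfo.GetNextTextElement, a related framework method that returns the text element starting at a given index; the class TextElementDemo and the helper FirstTextElement are hypothetical names for this example.

```csharp
using System;
using System.Globalization;

public static class TextElementDemo
{
    // Hypothetical helper: returns the first text element, or "" for null/empty input.
    public static string FirstTextElement(string s)
    {
        if (string.IsNullOrEmpty(s))
            return "";
        return StringInfo.GetNextTextElement(s, 0);
    }

    public static void Main()
    {
        string bold = "\U0001D400";              // one code point, two code units
        string hHat = "H\u0302=T\u0302+V\u0302"; // each letter carries a combining circumflex

        Console.WriteLine(FirstTextElement(bold).Length);  // 2 (the whole surrogate pair)
        Console.WriteLine(FirstTextElement(hHat).Length);  // 2 ('H' plus U+0302)
        Console.WriteLine(FirstTextElement(hHat) == "H");  // False
    }
}
```

Exact text element boundaries for more complex sequences (for example emoji joined with zero-width joiners) depend on the .NET version, so treat this as illustrative rather than exhaustive.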

Sample fiddle here.