From: David Taylor
Sent: Friday, 5 October 2001 7:38 PM
To: jo.lim@auda.org.au
Subject: Comments on Registry Technical Spec in Email (and also attached txt file) - Thanks.

Response to Draft .AU Registry Technical Specification
------------------------------------------------------

Author: David Taylor

The Registry Technical Specification is an important document defining key technical standards with which registries and registrars must comply. These standards will have a significant impact on Australian registries and registrars for years to come. It is important to ensure that the requirements documented in the specification are based on a broad consensus of the Australian internet industry.

I note this is the first time the Australian industry has been given a chance to contribute to the technical specification. It would have been more appropriate for auDA to have asked for industry input earlier in the process.

To achieve a smooth transition from the current legacy Australian domain name systems to an industry best practice system, it is important that the recommendations of individuals and organizations with past experience in the introduction of competition in the global environment be incorporated into the registry technical specification.

I have been a Melbourne IT employee for 5 years, working on com.au, gTLD registrar and gTLD registry projects. In 1998/1999 I was the technical lead on a project to build what became the system supporting the Melbourne IT gTLD Registrar business. I also authored a substantial technical section of the .BIZ proposal to ICANN, and authored the .BIZ registry business requirements before NeuLevel was formed as an independent entity.

Because of the relevance of my past experience, I am submitting these comments separately from Melbourne IT's formal comments (submitted by the Melbourne IT CTO Dr Bruce Tonkin with input from Matt Narayan, one of Melbourne IT's domain name policy experts).

I had intended to go into greater detail as to why EPP should be used within the au namespace. However, in the open technical meeting on October 2, I noted unanimous support within the industry for immediately using EPP (instead of using an interim protocol now and transitioning to EPP at some later date). I now fully expect the registry technical specification to be redrafted taking into consideration all the good work of the IETF PROVREG working group and the EPP protocol. This will accelerate auDA's efforts with the introduction of competition by ensuring the process does not get delayed while deficiencies in the protocol are debated, and will alleviate the need to develop toolkits for a new protocol.

In general, the registry technical specification is a well drafted document. However, there are a number of areas where modifications need to be made. This comment document will discuss the following issues:

* Why the protocol should be based on XML.
* Why EPP should be mandated in the Registry Technical Specification.
* Database record format issues.
* Important Registry scalability issues (relating to the protocol).
* Other Issues.
* Conclusion.

The protocol should be XML based
--------------------------------

The protocol should be based on XML and related technologies (XML Schema, XML Namespaces, etc). Aside from being a standard way of formatting information (and describing and validating that formatting), XML provides additional benefits including:

* XML based protocols remove the need to write parsers. XML parsers are ubiquitous on all platforms and programming environments.
* XML Schema allows incoming and outgoing messages to be automatically verified for correctness of structure and content (XML Schema allows type information to be formally defined).
* The strong typing of XML Schema provides the ability to automatically generate native objects in any language (i.e. Java, C++, etc).
Native classes can be automatically generated based on an XML Schema definition, allowing either client or server side systems to be rapidly constructed, insulating developers from having to work directly with XML. This feature actually makes it easier to work with an XML based protocol than with a protocol like IRRP.
* XML Namespaces provide an elegant extensibility mechanism. (For more information see the EPP specification).

These benefits no doubt contributed to the decision to use XML for EPP. Australia can immediately obtain these benefits simply by adopting EPP.

I have personally run the EPP v1.4 schema through an automated class generator and produced objects that can immediately process EPP messages. These native objects allow incoming messages to be de-serialized, and outgoing messages to be serialized from their native object equivalents.

Note: For those wishing to try this, I suggest downloading IBM's Web Services Toolkit (for Java), Microsoft's .NET platform, or one of the many implementations available for other languages and platforms.

In a matter of hours, I have experimented with generating a server capable of responding to EPP requests using these readily available development tools.

Without a full appreciation of modern XML software development tools, it may appear that a protocol like IRRP would be easier to implement than EPP. However the opposite is in fact true; an XML based protocol like EPP is actually simpler to use, even for small registrars with no previous experience in the market.

The benefits I have described regarding XML based protocols such as EPP would be true even if open source EPP toolkits were not available. The EPP toolkits make the protocol even simpler to use.

It should be noted that many similar protocols have been standardising on XML due to the support infrastructure available in most modern development platforms.
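
To illustrate the point about not having to write a parser, here is a minimal sketch in Python using only the standard library XML module. The namespace URI (urn:example:registry-1.0), the element names and the function names are hypothetical and deliberately simplified; this is not the EPP schema. Note also that this parser checks well-formedness only, so validation against an XML Schema would need an additional tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and element names, for illustration only.
NS_URI = "urn:example:registry-1.0"
NS = {"r": NS_URI}

REQUEST = """<command xmlns="urn:example:registry-1.0">
  <check>
    <name>example.com.au</name>
    <name>example.net.au</name>
  </check>
  <clTRID>ABC-12345</clTRID>
</command>"""


def parse_check(xml_text):
    """Return (client transaction id, domain names) from a check request."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    names = [el.text for el in root.findall("r:check/r:name", NS)]
    trid = root.findtext("r:clTRID", namespaces=NS)
    return trid, names


def build_response(trid, availability):
    """Serialize a {domain: bool} mapping into a response message."""
    root = ET.Element("{%s}response" % NS_URI)
    for name, is_free in availability.items():
        el = ET.SubElement(root, "{%s}name" % NS_URI,
                           {"avail": "1" if is_free else "0"})
        el.text = name
    ET.SubElement(root, "{%s}clTRID" % NS_URI).text = trid
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    trid, names = parse_check(REQUEST)
    print(trid, names)
    print(build_response(trid, {n: True for n in names}))
```

The registrar-side code never touches raw text: it reads and writes elements through the library, which is the property that makes an XML based protocol cheap to adopt. A generated-class approach, as described above, would go one step further and hide the element names as well.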

The Registry Technical Specification Should Adopt EPP
-----------------------------------------------------

The specification states that IRRP will be used as an interim protocol until EPP has been adopted as an industry standard, after which it will also be supported. The technical specification seems to underestimate the level of completeness in the EPP specification.

The document states IRRP will always be supported, forever adding a marginal cost to the management of domain names in Australia. I am sure many of us have been involved in projects where statements have been made regarding migrating to some different protocol in the future; however in production environments this very rarely happens. Once a data schema and protocol have been selected, the Australian internet industry will be living with those choices for many years into the future.

The registry technical specification gives five reasons why EPP has not been mandated as the registry protocol. The first three reasons are related to registry configuration and policy, and really have nothing to do with the choice of EPP (or some other protocol).

The fourth reason is that EPP has not yet been adopted as an industry standard. The table presenting various protocols used by different TLDs does not give an accurate indication of the market acceptance of EPP. The table shows a number of registries using legacy systems that I am sure would be envious of auDA's current position of being able to "hit the reset button" and move to a more capable protocol and system. The table should have pointed out that Afilias (.info), NeuLevel (.biz), Global Name Registry (.name), RegistryPro (.pro), VeriSign (who designed the original EPP specification and will eventually migrate com, net and org over to EPP), and a number of additional registries are all using or planning to use EPP.

It is fair to say that EPP is a de-facto industry standard, given it is currently being used in production on some of the world's largest registries (the new gTLDs). EPP will no doubt be adopted as the IETF standard protocol for the domain name registration industry.

The final reason given for not embracing EPP is the statement that no public domain EPP toolkit is currently available. This may have been true when an initial draft of the registry technical specification was written; however it is certainly not true now, as there are a number of publicly available implementations. I looked on SourceForge and found the following:

  BSD Style License:       neulevel-rtk-j (will release over next week)
  LGPL Licence (Afilias):  epp-rtk
  GUI EPP Front-End:       eppinterpreter

Note: NeuLevel is currently in the process of getting a BSD style license written and will then upload the toolkit to SourceForge. If the NeuLevel package is not yet available for download from SourceForge, ask auDA for a copy (we will ensure they have the current version of the EPP registrar toolkit).

Note: I am under the impression the .NAME toolkit is also open source.

EPP is a powerful protocol, perfectly suited to the Australian environment. EPP has a number of predefined objects (Domain Name, Contact, Host, etc) that can either be used as defined, or extended as needed. In addition, auDA is free to define additional objects with EPP mappings.

The Melbourne IT research and development groups have previously developed three XML based domain name protocols (only one of which was publicly launched into production), including protocols specifically designed for the Australian environment. We have contributed some of this experience to the author of EPP, doing our part to ensure the EPP protocol was designed with attributes appropriate for the local environment. It is important to note that EPP has been designed to support both gTLDs and ccTLDs.

The Australian market should leverage off EPP and the combined knowledge of members of the PROVREG working group, which consists of employees of many experienced domain name registries and registrars. By following this path, auDA will avoid the past mistakes of other registries, and ensure a smooth transition to competition.

Database Record Format (APPENDIX A)
-----------------------------------

It is of vital importance that the registry technical specification correctly defines registry objects, object fields, and object relationships. This needs to be done right the first time, as changes to schema formats at a later date will cause significant problems for the industry. It will be difficult enough changing from the current AUNIC format to the format in the registry technical specification, let alone repeating the same exercise 6-12 months later. For example, note that even end user registrants will have to gain some understanding of what is happening to AUNIC, AUNIC NIC-HANDLES, etc.

It is also worth noting that changing schema formats at a later date would probably be more difficult than changing registry protocols. Thus it is imperative this is done right the first time.

Adopting the base EPP objects (such as domains, contacts, hosts, etc) and extending them as required to suit the local environment makes more sense than trying to define these objects from scratch.

The duplicated state and country fields in the Registrar, Registrant and Contact records are not necessary and will lead to database inconsistencies. This is most certainly not best practice. Such fields should be (and of course always are) presented in plain English to registrants in emails and on web pages (i.e. AU = AUSTRALIA); however the protocol and database design should not have these duplicate fields. The 2 character country code should remain (and the English string should be removed).
The state code field should be removed, and the 50 character string should remain (so the system will support states outside Australia). However there should be a policy (implemented by the registry as a validation check) that if the country code = "AU", the state field must equal one of ["NSW","VIC","SA",etc] or the command will be rejected. This achieves auDA's goal and removes any duplicated fields.

REGISTRANT RECORD

The registrant record does not define fields to collect information for scenarios where a company owns a number of businesses and wants to register a domain name for the parent company and each registered business name (Bruce Tonkin elaborated on this in detail in Melbourne IT's response). If these records are not redefined, the technical specification should document what should happen when the current AUNIC data is converted into this format (i.e. is this data simply discarded?).

Note: The registrant record is a great example of where an XML representation would allow different types of registrants to be elegantly represented.

DOMAIN RECORD

The domain record defines up to 8 nameserver delegations. Although this is no doubt adequate for 95% of cases, why doesn't the technical specification support the full 13 possible servers? If the EPP domain object is supported, the full number of allowable servers can be supported in an elegant way.

Past experience with large registries (such as com/net/org) has revealed the many problems in collecting IP address information for delegated nameservers. Problems include database consistency errors and registrant confusion. For example, if a domain name is delegated to nameservers in another namespace (i.e. ns1.nameserver.com, ns2.nameserver.com), and the hosting company changes the nameserver's IP address at a future date, the au registry would then be storing an incorrect IP address. It is also worth noting that end user registrants rarely know the IP address of the nameserver to which their domain is to be delegated.

Obviously, the technical specification must still solve the 'glue record' issue. The possibilities include:

* Collect IP addresses, but only if the delegated nameserver is in the subdomain of the registered domain name (if the nameserver is not inside the subdomain and IP addresses are supplied, the command should be rejected).
* Add a new HOST RECORD (as per EPP).

If EPP is adopted, the standard EPP Host object solution will eliminate this problem.

CONTACT RECORD

Having each Contact record point to a Domain record means that if an entity registers 100 domain names, the same data will be duplicated 100 times. It would be preferable to have the concept of a registrant account, where a number of contact records are associated with that account (admin, tech, etc), and where domain names are also associated with that registrant account. This will reduce duplication in the database, and as a general rule end users understand the concept of an 'account' with associated services (such as domain name registration) belonging to it. The current schema needs to be redesigned (and more discussion needs to be held with industry participants to build consensus).

Having a fax number as a mandatory field is not recommended. While most companies have fax numbers, many individuals will not have a fax number. It is also worth considering that fax is an older technology that may slowly give way to email (in other words a smaller percentage will have a fax machine in 5 years).

The contact record does not appear to cater for 'Virtual Contacts'. It is important to note that many companies use virtual aliases such as "Manager@example.com.au". It is obviously easy to fudge a record, but is this what auDA wants by design?

As a general note, adopting EPP (and extending the protocol as required) will solve many issues with the current schema.

Registry Scalability Issues
---------------------------

A. Withholding Domain Names Based On Domain Enquire Command
-----------------------------------------------------------

Appendix B defines the IRRP protocol to reserve domain names when their availability is checked using the domain-enquire action command. It is very clear why the protocol designers decided this was a good idea: when a registrant checks the availability of a domain name, it will often take an additional 1-5 minutes for the registrant to complete a registration form (possibly with credit card payment details). During this time it is possible, but highly unlikely, that another registrant registers the domain name, causing confusion for the original registrant who had been informed that the domain name was available a few minutes before.

However the protocol designers may not appreciate the negative ramifications of this protocol design decision. Based on our experience with registry and registrar systems, this will cause a number of problems.

Technically advanced registries usually find the percentage of domain-enquire (check availability) commands is somewhere between 95% and 99% of the total number of commands received by the registry. It is fair to expect this will be replicated in Australia with the introduction of new and more capable registry systems. Thus a large registry looking into the future may wish to design their system to handle 1 billion protocol commands per month, of which between 950 million and 990 million are only check availability commands. To implement the reservation requirement currently in the technical specification would require that every one of these domain-enquire commands hit the central registry database, causing scalability and possibly reliability problems.
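
To put the figures above in perspective, the following short Python sketch splits the 1 billion commands per month into availability checks and all other commands, and converts each to an average rate per second. The 30-day month and the function name are my own assumptions for illustration, and the rates are averages only; peak load on a real registry would be considerably higher.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month


def load_profile(commands_per_month, check_percent):
    """Split a monthly command volume into availability checks and other commands."""
    checks = commands_per_month * check_percent // 100
    other = commands_per_month - checks
    return {
        "checks": checks,
        "other": other,
        "checks_per_sec": checks / SECONDS_PER_MONTH,
        "other_per_sec": other / SECONDS_PER_MONTH,
    }


if __name__ == "__main__":
    # The 95% and 99% bounds quoted in the text above.
    for pct in (95, 99):
        p = load_profile(1_000_000_000, pct)
        print(f"{pct}% checks: {p['checks']:,} checks, {p['other']:,} other; "
              f"avg {p['checks_per_sec']:.0f}/s vs {p['other_per_sec']:.0f}/s")
```

Under the reservation requirement, the "checks" column would have to reach the central database along with the "other" column, rather than only the much smaller volume of commands that actually change registry data.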

The benefit of not requiring reservation of domain names is that domain name availability can be cached on a set of front-end servers, without having to forward the queries through to the central database server (thus removing more than an order of magnitude of load from the database). Changes to domain name availability can be replicated out to the front-end servers in a matter of seconds using intelligent algorithms. This type of flexibility allows registries to produce more scalable and reliable systems at a significantly lower cost for the back-end database component.

There are other reasons why reservation of domain names is not recommended:

* It could very easily be used to abuse the system (for example, after the WTC disaster, someone could reserve hundreds of combinations of wtc*.net.au and then take their time deciding which domains to actually pay for).
* In a competitive environment it is very common for registrants to shop around (between different registrars). It would be very confusing if a registrant checked the availability of a domain name through Registrar A, but decided to purchase the domain through Registrar B, only to find it is reserved!

While I understand the scenario the protocol designers were trying to accommodate, in practice it occurs in less than 0.01% of cases, and the disadvantages of reserving domain names far outweigh the advantages.

B. Using Message Sequence Numbers
---------------------------------

The IRRP protocol specifies that messages must include a sequence number. The IRRP server is specified to reject out of sequence update messages. It is important to point out that whilst this mechanism will achieve the effect desired by the protocol designers, EPP solves the same database integrity issue without needing sequence numbers (purely based on protocol command design).

The sequence number technique can cause scalability problems for registrars using a scale-out architecture (which has become the dominant way to build reliable internet systems). Note that it is common for large registrars to have a farm of redundant application servers. Channel partners of these registrars will be submitting requests to the registrar. These requests are usually load balanced across the farm of application servers. Under this common, highly reliable configuration, the registry (IRRP) sequence number requirement means the registrar must synchronise outgoing registry messages (emitted from the application server farm), which could cause scalability issues (and lower the reliability of registrar systems). These problems will be solved simply by using EPP.

C. Integer Size
---------------

The protocol specifies 2^32 as the maximum integer value (for example for sequence numbers). It should be noted the com/net/org registry performs close to 1 billion RRP commands each month. At that rate, 2^32 (about 4.3 billion) commands would be reached in under five months. If auDA is planning a registry for the future, this limit may not be sufficient (2^64 is a more appropriate figure).

Other Issues
------------

Need Registrant Email Address: There have been many problems in the gTLD market (and even within .au) with domain name resellers (channel partners) putting themselves as both the 'Admin' and 'Tech' contact for the domain name. It is very difficult to control this issue (policy is usually not enough because it is often ignored). The auDA technical specification currently only specifies the snail mail address for the domain name registrant. It is important that a "Registrant Contact" is associated with every domain name (either a third contact object, or additional contact fields added to the registrant object).
The only other option would be for auDA to make the registrant agreement "black and white" regarding who the admin contact may be (in other words, without a registrant contact, auDA would have to implement a policy that the admin contact must be the actual owner and not an agent).

Whois Format: The format of Whois data should be defined. This is of particular importance in a multi-registry environment. The incompatibility of Whois data has caused many problems in the gTLD registrar market when attempting the transfer of registrar function.

XML Whois Format: Depending on the configuration of registrar access to contact data, it may be important to support an XML formatting of Whois information. Long term, this is a feature auDA should require.

Data Escrow: Should ideally be based on standard EPP objects and/or on the format ICANN defined in the latest gTLD contracts.

Domain Name Expiry, Deletion and Holding Periods: The process needs more definition and should mirror the gTLD registries (.com, .biz, etc).

IRRP Cascading Deletes: While it may reduce bandwidth requirements, it is dangerous to support cascading deletes in a protocol such as IRRP. In addition, there are some bizarre behavior scenarios defined for delete-contact and delete-domain. This urgently needs additional review. Of course, adopting EPP will make this issue irrelevant.

Conclusion
----------

The Registry Technical Specification is a well written document, and the authors have done a good job considering the ambitious schedule. However improvements can be incorporated in a number of areas. Most importantly these include requiring the use of EPP, and ensuring correctly defined objects in the database record format section.

I strongly urge auDA to release the next version of the technical specification for public review and to call another meeting similar to the first technical review meeting to ensure industry consensus.

Thank you for considering these comments.

David Taylor
Melbourne IT