Samba4/DRS TODO List: Difference between revisions

From SambaWiki

Revision as of 03:46, 20 September 2009

Join w2k8 to samba4 dc

We've been concentrating up to now on Samba4<->Samba4 replication, and Samba4<->Windows replication where the Samba4 server joins the Windows domain. A more difficult problem is making it work when you start with a Samba4 domain (from provision, or from vampiring a Windows domain) and then try to add another Windows DC by using dcpromo. This is currently failing with an obscure error at the end of the dcpromo process.

Create connection object (nTDSConnection)

Our KCC implementation (in source4/dsdb/kcc) is very simple at the moment. It should work by creating nTDSConnection objects under the nTDSDSA objects in the LDAP tree, then using those to create the repsFrom attributes, and possibly sending DsUpdateRefs operations to the other DCs to set up a repsTo on each replication partner.

Right now we don't create nTDSConnection objects at all, which needs to be fixed.

Update to new doc release

We should look through the new WSPP docs release (from August 2009) and see what we haven't implemented yet, forming a more extensive todo list than this one. Now that we have basic replication working we can start to try to get all the corner cases right, and for that the docs (especially MS-DRSR and MS-ADTS) are a good source of information.

Why isn't repsTo written by Windows?

I have noticed that Windows is not sending us a DsUpdateRefs to update the repsTo when we join a Windows domain as a 2nd DC. This means if we followed the correct behaviour we would never send Windows a DsReplicaSync message, so we'd never tell Windows to replicate from us.

To work around this, dreplsrv_notify_check() currently cheats by using repsFrom if repsTo is empty. We need to instead work out why Windows is not sending us DsUpdateRefs messages. Perhaps it is related to the lack of nTDSConnection objects?
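
The fallback can be pictured with a small Python sketch. This is not the C code in dreplsrv_notify_check(); the function name `notify_targets` and the dict layout are hypothetical and only illustrate the "use repsFrom if repsTo is empty" rule.

```python
def notify_targets(partition):
    """Return the partners we would send DsReplicaSync to for one partition.

    `partition` is a hypothetical dict with optional "repsTo" and
    "repsFrom" lists; the real code reads these attributes from the SAM.
    """
    reps_to = partition.get("repsTo") or []
    if reps_to:
        # Correct behaviour: notify only the DCs that registered via DsUpdateRefs.
        return list(reps_to)
    # The "cheat": no repsTo was ever written, so fall back to repsFrom.
    return list(partition.get("repsFrom") or [])
```

Once Windows sends DsUpdateRefs as expected, the fallback branch should become dead code and could be removed.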

repadmin.exe tool

The repadmin.exe tool on Windows is a great way of seeing the status of replication. We would like to get all of the options of repadmin working when directed at a Samba4 DC. Anatoliy is working on making some of the functions work, but there are plenty more to do.

hook delete in repl_meta_data

Right now we just pass delete operations down through the repl_meta_data module to the ldb_tdb backend. That means that deletes are not replicated (as they don't change anything in replPropertyMetaData or in the uSNChanged attribute).
We should intercept delete operations and translate them into a combination of a rename to an object in the "Deleted Objects" tree, along with a modify to add the isDeleted attribute. Then we need to set up the tombstone data in the object, and add a tombstone reaping task that would run once a day to really delete expired tombstone records.
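
A rough Python sketch of that plan follows. It uses a plain dict in place of ldb. The helper names, the simplified "-DEL:" naming of the renamed object, the `deletedAt` field and the 180 day lifetime are all assumptions made for illustration, not what the module does today.

```python
import time

# Assumed tombstone lifetime in seconds; illustration only.
TOMBSTONE_LIFETIME = 180 * 24 * 3600


def tombstone_delete(db, dn, deleted_container, now=None):
    """Turn a delete of `dn` into a rename plus a modify (hypothetical helper).

    `db` is a plain dict mapping DN -> attribute dict, standing in for ldb.
    """
    now = time.time() if now is None else now
    obj = db.pop(dn)
    rdn = dn.split(",", 1)[0]
    # Rename under the deleted objects container; the GUID keeps the name unique.
    new_dn = "%s-DEL:%s,%s" % (rdn, obj["objectGUID"], deleted_container)
    # Modify: mark the object as a tombstone and remember when it was deleted.
    obj["isDeleted"] = True
    obj["deletedAt"] = now  # placeholder for the real tombstone data
    db[new_dn] = obj
    return new_dn


def reap_tombstones(db, now=None):
    """Really delete tombstones older than TOMBSTONE_LIFETIME; meant to run daily."""
    now = time.time() if now is None else now
    expired = [dn for dn, obj in db.items()
               if obj.get("isDeleted") and now - obj["deletedAt"] > TOMBSTONE_LIFETIME]
    for dn in expired:
        del db[dn]
    return expired
```

Because the delete becomes a rename and a modify, it changes the object's metadata and so would be picked up by replication like any other change.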

phantom objects

If DRS replication adds a link to an object that doesn't exist we are supposed to create a "phantom" object, which gets filled in later. We are working around that at the moment by delaying link creation until the end of the transaction for the replica cycle, but we should also support phantom objects.
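
The current workaround amounts to queueing links and applying them at commit time. The Python sketch below shows the idea with a dict standing in for the SAM. The class and method names are hypothetical, and the links it returns as dangling are the ones that would need a phantom.

```python
class ReplicaCycle:
    """Collects link additions and applies them at the end of the cycle."""

    def __init__(self, db):
        self.db = db             # dict: DN -> attribute dict (stand-in for the SAM)
        self.pending_links = []  # (source DN, attribute, target DN)

    def add_object(self, dn, attrs):
        self.db[dn] = dict(attrs)

    def add_link(self, source_dn, attr, target_dn):
        # Do not touch the database yet: the target may arrive later in the cycle.
        self.pending_links.append((source_dn, attr, target_dn))

    def commit(self):
        """Apply deferred links; return those whose target never arrived."""
        dangling = []
        for source_dn, attr, target_dn in self.pending_links:
            if target_dn in self.db:
                self.db[source_dn].setdefault(attr, []).append(target_dn)
            else:
                # This is the case that needs a phantom object.
                dangling.append((source_dn, attr, target_dn))
        self.pending_links = []
        return dangling
```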

Sort objects on disk

Some sysadmins might write scripts that rely on the return order of attributes within objects (e.g. objectclass first). We sort objects on add in repl_meta_data.c to cope with this, but we don't fix the sorting on modify. That should be fixed.

Speed up replmd_ldb_message_element_attid_sort

The replmd_ldb_message_element_attid_sort function is pretty inefficient. We need to avoid the attribute lookups in the sort comparison function.
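
One way to avoid the lookups is to resolve each attribute ID once before sorting and then compare only the cached IDs. The Python sketch below shows that pattern; `lookup_attid` is a hypothetical stand-in for the schema lookup, and the real function is C code operating on ldb message elements.

```python
def sort_elements_by_attid(elements, lookup_attid):
    """Sort (name, values) pairs by attribute ID with one lookup per element.

    `lookup_attid` is a hypothetical callable mapping an attribute name to
    its numeric ID; it stands in for the schema lookup.
    """
    # Decorate: resolve each ID once, instead of twice per comparison.
    decorated = [(lookup_attid(element[0]), index, element)
                 for index, element in enumerate(elements)]
    # Compare only the cached ID; the index keeps the sort stable.
    decorated.sort(key=lambda item: item[:2])
    return [element for _, _, element in decorated]
```

With n elements this does n lookups rather than one pair of lookups for each of the roughly n log n comparisons.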

Don't allow replication of readonly attribs

We should not allow replication to overwrite readonly attributes. There are other attribute flags we aren't honouring as well. We should check the docs and add support for all the relevant attribute flags.

Support RODC

An RODC (read-only domain controller) is a potentially very useful use case for Samba4. There are quite a lot of changes in replication and attribute filtering that should be done when we are an RODC.

Separate gc partition

Right now the gc partition is just an amalgamation of the normal base partitions, with no filtering (we just set the magic control to say that searches should cross partition boundaries).

We need to decide if we should make a separate ldb for the gc partition, and if so what method we will use to keep it in sync. If we don't create a separate partition then we should add the right filtering to gc searches.

If modify sets attrib to same value then no replPropertyMetaData change

A modify via DRS replication that asks for an attribute to change to the same value it already has should be filtered out by repl_meta_data.c so that the replPropertyMetaData attribute is not updated.
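
The filtering step could look like the Python sketch below, which drops requested changes that would leave the attribute as it is. The function name and the dict representation are hypothetical; in this picture the caller would update replPropertyMetaData only for the attributes that are returned.

```python
def filter_noop_replace(current, requested):
    """Return only the requested attribute changes that differ from `current`.

    Both arguments are hypothetical dicts mapping attribute name -> list of
    values; value order is ignored when comparing.
    """
    changed = {}
    for name, values in requested.items():
        if sorted(current.get(name, [])) != sorted(values):
            changed[name] = values
    return changed
```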

Fix error mapping (no FOOBAR, and replmd_replicated_request_werror)

We have lots of code that returns WERR_FOOBAR or NT_STATUS_FOOBAR because we didn't know what error to return. We need to go through these and either work out the correct error code, or if that is hard then at least put a reasonable guess of the right error code along with a TODO comment to check it.

Parentguid fix

We store parentGUID in the object on disk at the moment, whereas we should construct it at runtime when asked for.
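
Constructing the value at runtime means looking up the parent by DN when the attribute is requested. A minimal Python sketch, assuming a dict in place of the SAM and a naive split on the first comma (escaped commas in RDN values are not handled); the function name is hypothetical:

```python
def parent_guid(db, dn):
    """Compute parentGUID on demand instead of storing it on the object.

    `db` is a hypothetical dict mapping DN -> attribute dict.
    """
    if "," not in dn:
        return None  # single-component DN: no parent
    parent = db.get(dn.split(",", 1)[1])
    return parent["objectGUID"] if parent else None
```

A value computed this way cannot go stale when an object is renamed or moved.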

Honor attribute replication flag

There is an attribute flag for whether particular attributes should be replicated. We need to check that we get this right.

Double cn fix

When we do an s4<->s4 vampire we end up with the rDN attribute appearing twice on all objects in the new replica. We think this is because we should be filtering the rDN in the getncchanges code, but this needs checking.

check for parent exists in replication add and rename

During replication add and rename we need to check that the destination parent exists.
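
A minimal Python sketch of such a check, using a dict in place of the SAM. The function names and the `partition_roots` parameter are hypothetical, and the DN handling is a naive split on the first comma.

```python
def parent_dn(dn):
    """Return the parent of `dn`, or None for a single-component DN.

    Naive split on the first comma; escaped commas in RDN values are not handled.
    """
    parts = dn.split(",", 1)
    return parts[1] if len(parts) == 2 else None


def check_parent_exists(db, new_dn, partition_roots=()):
    """True if `new_dn` may be added or renamed to: its parent is present."""
    if new_dn in partition_roots:
        return True  # a partition root has no parent inside the partition
    parent = parent_dn(new_dn)
    return parent is not None and parent in db
```

For a rename the check applies to the new DN, so that an object is never moved under a parent we do not hold.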

Handle add where DN exists, but different GUID

We may need to handle the case where a DRS replication comes in for a DN that exists, but with a different GUID. We need to test with Windows how this is handled.

Net commands to query repl status (via DRS?)

We should add net commands for querying the replication status (somewhat like repadmin.exe).

Max number of attributes on objects?

Metze noticed that the WSPP docs specify a maximum object size in AD of around 8k. This seems to translate into a maximum number of attributes that Windows accepts. We may need to implement a similar limit to prevent problems with replication s4->Windows.

Obey acls on objects

We need to obey the ntSecurityDescriptor on objects in our SAM. This is a large task! Nadya is working on it and hopefully will merge soon.

Fix ldb_add objectclass sorting

In ldb_add we sort objectClass attributes in the objectclass module. The sort is currently horribly inefficient - it needs redoing using the sort indexes that Andrew and Nadya have recently added.

-s option to setup_dns.sh

The setup_dns.sh script should be redone as a Python wrapper so it obeys standard options like -s and can read smb.conf.

what triggers initial kcc run on windows after we join a w2k8 DC?

After we join an s4 DC to a Windows domain, we've noticed that w2k8 needs to be prompted to run its KCC using "repadmin /kcc". We need to work out why this is needed so we can fix it.

s4<->s4 in make test

We should add the s4<->s4 vampire and replication to make test.


Urgent replication

We need to add the urgent bit on replications that have changed critical objects (see the docs for a list). We will probably need to expand @REPLCHANGED to add a uSNUrgent attribute to support this.

Group policies

We are not currently obeying group policies, although we can serve them out to clients. We need to obey the ones that make sense for Samba. For this we need to provide a really easy API to allow any part of Samba to query a group policy, and to auto-update SAMDB with the needed changes.

Linked attributes

We currently accept the w2k8 linked attributes in replication, but when other DCs replicate from us we serve up linked attributes as normal attributes (which is what a downlevel w2k3 DC does). We should store the full meta data associated with linked attributes in more fields in the extended DN and serve it up in getncchanges.

Add support for ndr64 to wireshark

When watching w2k8-R2 <-> w2k8-R2 interactions, Windows chooses NDR64 instead of NDR. We now support NDR64 in Samba, but wireshark doesn't understand it. To allow us to watch traffic between w2k8-R2 boxes we would like wireshark to understand NDR64.

Convert wireshark drsuapi to pidl

The DRSUAPI decoder in wireshark is quite poor. We should redo it using a pidl based parser.

Fix decryption of w2k8 by wireshark (krb5 patch)

When watching w2k8 <-> samba traffic in wireshark we often find that wireshark cannot decrypt some of the traffic. This is due to a bug/limitation in MIT kerberos. Metze has a hack based on LD_PRELOAD that works around this, but we should try to get this into the wireshark svn tree directly.

bitmap32 actually 3264 in samr QueryUserInfo level 16? (netmon bug too)

There seems to be a problem with the QueryUserInfo level 16 and NDR64. The Microsoft netmon 3.3 parser has the same problem as our ndrdump parser. We need to look into how this should be handled.

How does another DC become the FSMO master and RID master

We need to work out how a DC should become the FSMO master and RID master. We can do it now via ldbedit, but there should be a more automated method (perhaps the KCC should do this?).